
PSA: Don't let your circuit breakers trip on your critical equipment

Mark R

Diamond Member
So, there's some power maintenance work going on at work this weekend. We've got a skeleton staff, but the work still needs to be done, on time. The engineers are installing 'power optimization units' (whatever the heck those are) that will provide 'significant financial and efficiency benefits'.

So, to do this, they will disconnect the power to a wing of the building and switch it over to generators within 10 seconds.

This happens, and all the UPSs kick in on all the servers, LAN infrastructure, key workstations and key equipment. Except that when the power comes back, about half the circuit breakers trip straight off due to the 'turn-on surge'.

We lose 1 piece of key equipment as its breaker goes out, plus about half the PCs and most of the workstations. Things work for about 10 mins, then the LAN slowly starts falling apart as UPS after UPS runs out of battery. Then the workstations start suddenly powering off, losing work and irreplaceable data - their UPSs had run flat, and there was no auto-shutdown configured.

After about 15 minutes, we've got about 2 PCs and 1 workstation that still have LAN access to the main campus.

Oh, but wait. There's another problem: the AC is out and it's getting sweltering - but more importantly, the equipment chillers are out, and the server room AC is out.

After 40 minutes of hard work, one of our key bits of equipment starts complaining it's overheating and shuts down because it's lost its chilled water supply. (There is a backup machine, but it's got no power because the breakers have tripped out.)

Of course, all this was reported to the maintenance department as soon as the power didn't come back up - but they didn't believe it, so didn't send anyone. It's only once we'd lost all our key equipment that they thought there might be something up.

Moral of this story:
1. There's no point in having UPSs and the like if you can't reliably transition from mains to generator and back, because your circuit breakers aren't appropriately rated.
2. If you've got UPSs, make sure they're configured to perform a clean shutdown, otherwise you'll end up with hosed data and corrupted systems.
3. Don't out-source your building maintenance to the lowest bidder
4. WTF is a power optimizer anyway?
4b. If it ain't broke, don't fix it.
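Moral #2 is the one that's cheapest to fix in software. As a hedged sketch (not anything from the original post): with the free Network UPS Tools (NUT) package, a small script can poll the UPS and trigger a clean shutdown before the battery runs flat. The UPS name `myups` and the thresholds are illustrative assumptions.

```python
import subprocess

# Hedged sketch, assuming NUT is installed and monitoring a UPS named
# "myups". The 40% charge threshold is an arbitrary illustrative choice.

def should_shutdown(status: str, charge: int, min_charge: int = 40) -> bool:
    """True once the UPS is on battery ("OB") and charge has fallen low."""
    return "OB" in status.split() and charge <= min_charge

def poll_ups(ups: str = "myups@localhost") -> dict:
    """Parse `upsc` 'key: value' output into a dict (requires NUT)."""
    out = subprocess.check_output(["upsc", ups], text=True)
    return dict(line.split(": ", 1) for line in out.splitlines() if ": " in line)

# Usage, on a machine that actually has NUT and a UPS attached:
#   vars = poll_ups()
#   if should_shutdown(vars["ups.status"], int(vars["battery.charge"])):
#       subprocess.run(["shutdown", "-h", "now"])
```

In practice NUT's own `upsmon` daemon does this job for you; the point of the sketch is just that "UPS without auto-shutdown" is a solved problem.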
 
Wow... that's some epic fail...

Seems like those "power optimization units" brought the wrong kind of financial impact... also not very efficient if all the equipment that's supposed to run on them is dead...
 
Where was the backup generator? If this was supposed to happen in like a minute or two, why not just shut everything down and then bring it back up? Why do people purposely stress out an electrical system?

You need better electrical engineers. Your place probably needs to be rewired for more excess capacity. A lot of circuit breakers were probably old and overstressed to begin with. I like when the women plug in their space heaters and all the computers go out.
 
Got a bit more info:

It was actually only 1 circuit breaker that tripped out - on one final circuit with quite a few PCs on it. There were loads of PCs connected to a few receptacles by a maze of power strips - so it was probably just too many PCs on that circuit.
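The "too many PCs on one circuit" failure is easy to sanity-check with arithmetic. All numbers below are illustrative assumptions (the post doesn't give PC counts or breaker ratings): a circuit that is fine in steady state can still trip at switch-on, when every power supply charges its input capacitors at once.

```python
# Back-of-envelope check: does simultaneous switch-on inrush exceed the
# breaker rating? The 3x inrush factor is a rough illustrative figure.

def circuit_overloaded(n_pcs: int, amps_per_pc: float, breaker_amps: float,
                       inrush_factor: float = 3.0) -> bool:
    """A breaker that survives steady load can still trip on inrush."""
    steady = n_pcs * amps_per_pc
    inrush = steady * inrush_factor
    return inrush > breaker_amps

# e.g. 15 PCs at ~1.5 A each on a 32 A circuit: 22.5 A steady is fine,
# but a simultaneous switch-on can briefly demand far more than 32 A.
```

Real breaker behaviour also depends on the trip curve (a type C or D breaker tolerates more inrush), which is exactly why "appropriately rated" in the morals above matters.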

The chiller unit had a problem, in that it doesn't power up automatically when power is restored, and must be manually restarted after a power interruption. Departmental staff knew this, but building services didn't, so they ignored calls from staff to start the chiller, until $1 million worth of mission-critical precision equipment shut down due to a temperature warning.

The rest of the equipment (most workstations, PCs, other equipment, HVAC) would never have powered up, as it wasn't connected to the generator backup supply. The generators were only specified to provide about 30% of normal load - and due to a massive increase in computerisation in the last few years, the proportion of the load they could actually cover had fallen significantly.

It looks as though the main problem was that all the work had been computerised, but not enough thought had gone into what equipment was 'mission critical' and needed to be on the generator. The result was that large chunks of the LAN went down because they weren't on generator power, leaving half of the few PCs/workstations rated 'mission critical' with power but no LAN.

It also appears there wasn't much thought given to what equipment should have UPSs. A lot of specialist equipment was equipped with UPSs to prevent data loss, but the UPSs themselves were not supplied from a generator circuit, nor were the devices configured to auto-shutdown. As a result, all that happened was that the devices were shut down uncleanly, just with a 10 minute delay.

Thankfully, as it was the weekend, most of this equipment wasn't in use at the time, so didn't lose any data. The only data that was lost was from our big machine, when the power transferred to generator - it isn't on a UPS. Management balked when they saw the cost of a UPS capable of handling its 3-phase 150 kVA demand and decided to omit it. (Strictly speaking, the entire machine doesn't need a UPS, though it would be nice - a 5 kVA UPS on the data-acquisition and control computers would have been sufficient. You'd have thought that when spending $500k on a machine, you'd pony up the $10k for a UPS to protect its CPU and irreplaceable data, but hey, I'm not a manager, so I don't understand the thought process.)
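The kVA figures above come from the difference between real power (watts) and apparent power (volt-amps): a UPS is rated in VA, so for a given load in watts you divide by the load's power factor. As a minimal sketch (the wattage and power factor below are assumptions, not the machine's actual specs):

```python
# UPS sizing arithmetic: apparent power (VA) = real power (W) / power factor.
# The 4 kW load and 0.8 power factor are illustrative assumptions.

def required_va(load_watts: float, power_factor: float,
                headroom: float = 1.0) -> float:
    """Apparent power a UPS must supply for a given real load, with an
    optional headroom multiplier for growth and battery recharge."""
    return load_watts / power_factor * headroom

# A ~4 kW control/data-acquisition load at power factor 0.8 needs 5 kVA.
```

This is why a 5 kVA unit for just the control computers is plausible while a 150 kVA unit for the whole machine is a very different price class.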

The power optimization units also appear to be some sort of power factor correctors. Essentially, the building as a whole is at the limits of capacity of the 11000 -> 415 V transformers supplying it, and there is no real scope to upgrade the wiring capacity without replacing all the high-voltage equipment. The optimization units are intended to improve the power factor, so that more watts can be drawn without going over the transformers' volt-amp (VA) rating.
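The capacity argument works because the transformer is limited by apparent power (VA), while the building only does useful work with real power (W). A quick illustration, with made-up ratings (the post doesn't give the actual transformer size or power factors):

```python
# Why power factor correction frees transformer capacity: usable real
# power = VA rating * power factor. All figures here are illustrative.

def usable_kw(transformer_kva: float, power_factor: float) -> float:
    """Real power available within a transformer's apparent-power rating."""
    return transformer_kva * power_factor

before = usable_kw(1000, 0.80)  # uncorrected power factor
after = usable_kw(1000, 0.95)   # after correction
# Same transformer, ~150 kW more usable capacity in this example.
```

So the units buy headroom without touching the 11 kV switchgear - the rest of the story is about how they were installed.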
 
I was guessing about the PF correction ... apparent versus real power. Power meters cannot tell the difference between apparent & real power.

I think this is 1 of the better threads that I have read in quite a while. It may speak to the ego of a manager believing that they have enough information to make a competent decision.

Alternatively, if I had been the 1 reporting to that manager, I would have taken it upon myself to ensure that he actually did have enough info to make a competent decision.

In what industry do you work?
 
Wow. What fun. We have 2 major inline systems. The one at the primary site got the grid special status. It has enough power to keep most of the grid outside of our fence up and has done so. But, you can drop a screwdriver into just the right place... Well, not anymore. Sort of the secret thermal exhaust port below the main port. But now it is ray shielded. 😀

Once a month at both sites, you just do not want to be near those buildings. It gets really loud and they do a switchover test. The batteries are frigging huge.
 
Sounds like the normal things that management does... For instance, they decided they wanted trees outside the building instead of a generator that could handle the projected load of the server room. So we got a generator that could only handle 1/4 of the load. It was great when the first major power outage that lasted more than 10 minutes happened and everything started shutting down - we pulled out the 'we told you so' emails from when the building was being set up.
 
People very rarely do a good enough job when it comes to power requirements for the systems they are employing. It's pretty bad though when people don't think enough to check breakers and all that after coming back up like that.

Even if a power survey was done, it doesn't sound like it took into account the peak draw when those UPSs go to charge their batteries. Though you could have mitigated that by safely shutting down your systems and bringing the UPSs up one at a time.
 
The management sucks! This type of work should never have been planned during a work day.

The 7 P's Rule certainly applies here.
Proper
Previous
Planning
Prevents
Piss
Poor
Performance
 
I've found out, in detail, what the 'power optimization units' actually are.

I was partly right, in that they do offer a degree of power factor correction.

However, the main benefit is 'voltage optimization'. Apparently, industrial supplies tend to be commissioned so that their voltage is towards the high end of the acceptable range. E.g. a 230 V supply (with a -6%/+10% tolerance) would typically be configured to provide about 250-251 V under 'base load'. This provides a good reserve of voltage for starting motors and other heavy equipment, and offers a degree of protection against brownouts.

The 'power optimizer' is basically a step-down transformer that steps down from 250 V to 220 V (just above the lower limit of acceptable voltage). By reducing the voltage, the power consumption of lights and electric motors is reduced - with the side benefit of making them run cooler and last longer.
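For purely resistive loads (incandescent lighting, heating elements) the saving follows directly from P = V²/R, so power scales with the square of the voltage ratio. Note this is my own illustration, and the caveat is real: switched-mode PSUs (i.e. most computers) draw roughly constant power regardless of voltage, so they see little benefit.

```python
# Savings estimate for resistive loads under voltage reduction.
# P = V^2 / R for a fixed resistance, so power scales as (V_new/V_old)^2.

def resistive_power_ratio(v_new: float, v_old: float) -> float:
    """Fraction of original power drawn after a voltage change."""
    return (v_new / v_old) ** 2

ratio = resistive_power_ratio(220, 250)
# (220/250)**2 = 0.7744, i.e. roughly a 23% reduction - but only for
# resistive loads; constant-power electronic loads save almost nothing.
```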

The optimizer does have 2 tweaks.
1 - it has a special auxiliary winding on the transformer which corrects voltage waveform distortion caused by electronic devices. (Non-APFC electronic devices tend to draw current in short bursts at the top of the sine wave; this causes the voltage to sag at the peak, giving the wave a flat top.) For technical bods, this is a 'tertiary delta winding' which acts as a '3rd harmonic trap'.
2 - it has a 2nd auxiliary winding which helps ensure that the voltages on the 3 phases are balanced. 3-phase devices such as motors and industrial electronics require the voltage on each of the 3 phases in the supply to be closely matched. If the phases are at different voltages, this causes problems with motors and 3-phase electronic equipment and makes them operate with reduced efficiency and reduced power factor.
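The phase-balance point can be quantified. The common NEMA definition of voltage unbalance is 100 × (maximum deviation from the mean line voltage) / (mean line voltage); motors are typically derated once unbalance exceeds about 1%. A minimal sketch with made-up sample readings:

```python
# NEMA-style voltage unbalance: 100 * max deviation from mean / mean.
# The three sample voltages below are illustrative, not measured values.

def voltage_unbalance_pct(v_a: float, v_b: float, v_c: float) -> float:
    """Percent unbalance across three line voltages."""
    mean = (v_a + v_b + v_c) / 3
    max_dev = max(abs(v - mean) for v in (v_a, v_b, v_c))
    return 100 * max_dev / mean

# voltage_unbalance_pct(230, 232, 225) -> about 1.75%
# (mean 229 V, worst phase 4 V away) - already in derating territory.
```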
 