A night from hell in the ER last night


TraumaRN

Diamond Member
Jun 5, 2005
6,893
63
91
Originally posted by: iamwiz82
Originally posted by: BrunoPuntzJones
Originally posted by: Xanis
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.

You're generalizing. Not all IT people suck. It seems like the problem the OP's hospital had was all due to poor and ill-considered design.

Probably not the case, but it could be due to cost constraints at the top.

Some yahoo could have thought it crazy to authorize a million bucks for a back-up system, and IT had to make do with what they were given.

Story of my life. Whenever someone bitches about things like that I say "I can do it, bring money!" If management doesn't approve the cost, it does not get done until it catastrophically fails, and then it's implemented the next week. ;)

I'm quite sure there will be some e-mail about redundancy being added now. I think it wasn't so much a question of money as the speed of implementation. They wanted it in place NOW, and despite the system being up and running at my hospital for 18 months now, I think it's still a case of rushing initially and getting bitten in the ass now.

I mean, look, they implemented this system in just under 15 months across 9 hospitals and hundreds of satellite clinics. I think in all that speed some quality/redundancy was overlooked.

Like I said, I know the server room has its own backup generator, so something went wrong either with that, with the UPS, or with something internal to the server room.
 

homercles337

Diamond Member
Dec 29, 2004
6,340
3
71
Originally posted by: middlehead
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.
Judging from this quote, all HICKS from hickville, ND are assholes.

Fixed for joo.
 

homercles337

Diamond Member
Dec 29, 2004
6,340
3
71
Originally posted by: Xanis
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.

You're generalizing. Not all IT people suck. It seems like the problem the OP's hospital had was all due to poor and ill-considered design.

Yes, I was generalizing, thanks Captain Obvious. Sorry to step on your ego. :roll:

Edit: FTR, 0.1% of IT people aren't complete retards with no sense, education, or compassion.
 

dman

Diamond Member
Nov 2, 1999
9,110
0
76
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their job.

If it was one server that went down and kept going down, then I could say you have a problem with IT staffing. But for all those servers to go down for 6 hours, well, I can guarantee you somebody outside of IT (in finance) knew the risk and decided that the cost was, at the time, too high to build in the necessary redundancy.

Now that the problem has happened, I imagine they'll review the cause and that decision. Perhaps they'll invest in a fix, or perhaps they'll write it off as a fluke and hope it doesn't happen again. Either way, it WON'T be the IT dept making THAT decision. They'll just be documenting the root cause and what $$$ it takes to fix.

Further, going back to the first point about one server... if you have sucky IT staff that can't manage an environment, it's probably because the budget is deemed too low to attract the necessary skills.

But, go ahead and blame the IT department, they're used to it.
 

IAteYourMother

Nov 3, 2004
10,491
22
81
Originally posted by: middlehead
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.
Judging from this quote, all scientists are assholes.

well, he is an asshole
 

middlehead

Diamond Member
Jul 11, 2004
4,573
2
81
Originally posted by: IAteYourMother
Originally posted by: middlehead
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.
Judging from this quote, all scientists are assholes.
well, he is an asshole
Man Law!
 

middlehead

Diamond Member
Jul 11, 2004
4,573
2
81
Originally posted by: homercles337
Originally posted by: middlehead
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.
Judging from this quote, all HICKS from hickville, ND are assholes.
Fixed for joo.
Hickville's on the west end of the state, I'm on the east end. Fail.
 

JDMnAR1

Lifer
May 12, 2003
11,984
1
0
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a douchebag so this may mean nothing.

Nice generalization there, buddy. For the record, in my organization (also a 'Most Wired' hospital for four or more years), IT does not install backup generators, UPS units, cooling, or any of the other vital support services of a modern datacenter. That typically falls to electricians, plumbers, and other physical plant types. If the failure wasn't with the systems themselves, it was in all likelihood not IT's fault.

We have had two somewhat major failures in our datacenter in the past two years - one with cooling and one with power - neither of which IT could have prevented. With no water (a water main ruptured), chilled water coolers don't function very well, and it wasn't like we could go dig a well. Sure, we could have had supplemental cooling, and in fact had been asking for it for several years, but it didn't get approved (by hospital administration, not IT) and installed until after that failure. Our power issues were related to wiring, breaker panels, and transfer switches, none of which IT has any control over. Since the problem was at the distribution panel, there was no way that generator power could reach the systems.

When IT controls the purse strings in an organization, then you can reasonably place the blame on IT. Until then, the person who signs the checks has to assume some responsibility.
 

NuroMancer

Golden Member
Nov 8, 2004
1,684
1
76
This sounds really odd to me.

If the hospital didn't lose power, how could a datacenter of that size lose power unless it was incredibly stupidly designed?

All blades are capable of redundant power supplies for relatively cheap, so even if a breaker panel went, the other power panel + power connection should just be able to take over.

It sounds like some serious shortcuts were taken.
 

nweaver

Diamond Member
Jan 21, 2001
6,813
1
0
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.

I blame the consultant.
 

nweaver

Diamond Member
Jan 21, 2001
6,813
1
0
Originally posted by: NuroMancer
This sounds really odd to me.

If the hospital didn't lose power, how could a datacenter of that size lose power unless it was incredibly stupidly designed?

All blades are capable of redundant power supplies for relatively cheap, so even if a breaker panel went, the other power panel + power connection should just be able to take over.

It sounds like some serious shortcuts were taken.

yep.

In an ideal situation, every server would have two power feeds that come from two physically separate power panels/circuits and, if possible, from two providers. Same for network, storage, etc. Sounds like maybe it all went into one panel and that panel had a massive failure, or there was one major drop from the generator/power company that went TU or something. I'll bet IT asked for the best and got the cheapest. Remember, finance is going to go with the lowest bidder, not the best provider.
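
Just to make the "two physically separate panels" point concrete, here's a rough sketch of the kind of audit that catches servers whose "redundant" supplies actually hang off the same panel. The inventory format, hostnames, and panel names are all made up for illustration; a real shop would pull this from a CMDB/DCIM export:

```python
# Hypothetical inventory: server -> list of (PSU, panel) pairs. All names invented.
inventory = {
    "ehr-db-01":  [("psu-a", "panel-1"), ("psu-b", "panel-2")],
    "ehr-app-01": [("psu-a", "panel-1"), ("psu-b", "panel-1")],  # both feeds land on one panel
    "pacs-01":    [("psu-a", "panel-2")],                        # only one supply at all
}

def single_points_of_failure(inv):
    """Return servers that go dark if any single panel dies."""
    risky = []
    for server, feeds in inv.items():
        panels = {panel for _, panel in feeds}
        if len(panels) < 2:  # every supply depends on the same panel (or there is only one supply)
            risky.append((server, sorted(panels)))
    return risky

for server, panels in single_points_of_failure(inventory):
    print(f"{server}: all power comes through {panels} - one breaker takes it down")
```

It obviously doesn't prove the panels are independent further upstream (shared transfer switch, single generator, etc.), but it catches the lazy cabling mistakes.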
 

slag

Lifer
Dec 14, 2000
10,473
81
101
I find it funny that the majority of people here probably haven't set foot in a datacenter, let alone had a role in helping plan, engineer, or implement a redundant datacenter, RTSC, or even a lab for HVAC/network/power, and yet still know everything there is to know about everything.

A job I worked at had a major power failure one night during a systems test. Seems the circuit on the backup grid blew during failover. This circuit had been tested by an engineering firm and signed off as OK before being put into use, and it still failed. I don't remember all the specifics, but all the T's were crossed and the I's dotted, and things still didn't go as planned.

My point is, while you depend on your equipment to be ready to work, when it doesn't, sometimes it's no one's fault.

Everything fails eventually and sometimes shit just happens.
 

NuroMancer

Golden Member
Nov 8, 2004
1,684
1
76
Originally posted by: slag
I find it funny that the majority of people here probably haven't set foot in a datacenter, let alone had a role in helping plan, engineer, or implement a redundant datacenter, RTSC, or even a lab for HVAC/network/power, and yet still know everything there is to know about everything.

A job I worked at had a major power failure one night during a systems test. Seems the circuit on the backup grid blew during failover. This circuit had been tested by an engineering firm and signed off as OK before being put into use, and it still failed. I don't remember all the specifics, but all the T's were crossed and the I's dotted, and things still didn't go as planned.

My point is, while you depend on your equipment to be ready to work, when it doesn't, sometimes it's no one's fault.

Everything fails eventually and sometimes shit just happens.

That's why you build redundancy, and you test it...

That way, when shit blows up, it's no one's fault and the DC still doesn't go down.
 

pravi333

Senior member
May 25, 2005
577
0
0
This happened on my previous project. A raccoon bit through the town's power cables and the whole town lost power. The UPS kicked in as a transition system and was supposed to hand off to the generator, but a fuse in the UPS blew and the whole data center went down hard.
After 4 hours they gave us generator power and didn't go back to street power because of bad weather that was coming in shortly. After 8 hours we got the whole data center up and running. 12 hours after that, the filter in the diesel generator was going bad, so we had to shut down the whole data center, change the filter, and bring the whole data center back up.
After 3 days the fuse was fixed and they transitioned back to street power without any interruption. Shit happens!
 

pravi333

Senior member
May 25, 2005
577
0
0
Redundancy is a business decision. Trust me, every sysadmin wants an HA'd server (it saves their work too), but in the end, if the business allocates the funds, we're more than happy to implement it.
IT is just a support function for most businesses; losing IT might hurt the business, but it's not an ultimate show stopper.
 

slag

Lifer
Dec 14, 2000
10,473
81
101
Originally posted by: NuroMancer
Originally posted by: slag
I find it funny that the majority of people here probably haven't set foot in a datacenter, let alone had a role in helping plan, engineer, or implement a redundant datacenter, RTSC, or even a lab for HVAC/network/power, and yet still know everything there is to know about everything.

A job I worked at had a major power failure one night during a systems test. Seems the circuit on the backup grid blew during failover. This circuit had been tested by an engineering firm and signed off as OK before being put into use, and it still failed. I don't remember all the specifics, but all the T's were crossed and the I's dotted, and things still didn't go as planned.

My point is, while you depend on your equipment to be ready to work, when it doesn't, sometimes it's no one's fault.

Everything fails eventually and sometimes shit just happens.

That's why you build redundancy, and you test it...

That way, when shit blows up, it's no one's fault and the DC still doesn't go down.

Again, my point is, we could have 10 or even 100 successful tests, and the one time it's needed to work, something can fail. Everything mechanical fails eventually.

 

NuroMancer

Golden Member
Nov 8, 2004
1,684
1
76
Originally posted by: slag
Originally posted by: NuroMancer
Originally posted by: slag
I find it funny that the majority of people here probably haven't set foot in a datacenter, let alone had a role in helping plan, engineer, or implement a redundant datacenter, RTSC, or even a lab for HVAC/network/power, and yet still know everything there is to know about everything.

A job I worked at had a major power failure one night during a systems test. Seems the circuit on the backup grid blew during failover. This circuit had been tested by an engineering firm and signed off as OK before being put into use, and it still failed. I don't remember all the specifics, but all the T's were crossed and the I's dotted, and things still didn't go as planned.

My point is, while you depend on your equipment to be ready to work, when it doesn't, sometimes it's no one's fault.

Everything fails eventually and sometimes shit just happens.

That's why you build redundancy, and you test it...

That way, when shit blows up, it's no one's fault and the DC still doesn't go down.

Again, my point is, we could have 10 or even 100 successful tests, and the one time it's needed to work, something can fail. Everything mechanical fails eventually.

You're right, I agree. My point is that you're calling people out for criticizing the design of the DC, which apparently, probably due to budget, lacked redundancy.

The reason for dual-redundant or greater systems is simple: you reduce the chance that it fails exactly when you need it.
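
To put very rough numbers on it (illustrative probabilities only, nobody's real failure data): if a single power path fails, say, 5% of the times you actually need it, two truly independent paths both fail only about 0.25% of the time. The quick arithmetic:

```python
# Illustrative only: assumed per-path failure chance, not measured data.
p_fail_one_path = 0.05  # chance a single feed/UPS/generator chain fails when called on

single_path_outage  = p_fail_one_path           # 5.00%
dual_path_outage    = p_fail_one_path ** 2      # 0.25%, if the two paths are truly independent
shared_panel_outage = p_fail_one_path           # back to ~5% if both "redundant" feeds share one panel

print(f"single path:       {single_path_outage:.2%}")
print(f"two independent:   {dual_path_outage:.2%}")
print(f"two on one panel:  {shared_panel_outage:.2%}")
```

The "truly independent" part is the whole game; a shared panel, shared transfer switch, or a single generator behind both feeds quietly puts you right back at the single-path number.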
 

RadiclDreamer

Diamond Member
Aug 8, 2004
8,622
40
91
Originally posted by: skyking
That was an unforeseen and totally unacceptable failure mode for a hospital, IMO. Someone may get fired, and huge changes will be made because of that.

I work in a VERY wired hospital as well (we were considered for the above-mentioned award), and I can tell you no one will lose their job because of it. Things happen; we take as many precautions and build in as much redundancy as possible, but there are always going to be issues. Also, 6 hours is a very good job getting everything back up: many of the products we use don't start all their services automatically, so each box has to be logged into, in order, and then have its services started.
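
To give a rough idea of what that looks like (every hostname and service name below is invented; the real order comes from the vendor's recovery runbook), the 3 AM job is basically walking a list like this, in order, and not moving on until each tier is actually up:

```python
import subprocess

# Hypothetical startup order - all hostnames and service names here are made up.
STARTUP_ORDER = [
    ("ehr-db-01",  "oracle-db"),         # database tier first
    ("ehr-int-01", "interface-engine"),  # HL7 interface engine next
    ("ehr-app-01", "ehr-appserver"),     # application tier last
]

def start_service(host, service):
    """Start one service over SSH; raise if it doesn't come up cleanly."""
    cmd = ["ssh", host, "sudo", f"/etc/init.d/{service}", "start"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"{service} on {host} failed: {result.stderr.strip()}")

for host, service in STARTUP_ORDER:
    print(f"starting {service} on {host} ...")
    start_service(host, service)  # stop the sequence dead if any tier fails
```

Do the database before the interface engine and the interface engine before the app tier, or you just get to do it all over again.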

I just wish my users were as appreciative as this guy when it comes to unexpected downtime. Hell, we can give a month's notice and they still complain.
 

RadiclDreamer

Diamond Member
Aug 8, 2004
8,622
40
91
Originally posted by: 2Xtreme21
What would cause it all to fail like that? Someone didn't put the correct safeguards in place, if you ask me.

Water sensors in the floor have done it for us; they shut down the UPS and take all the servers with them as a safeguard.
 

RadiclDreamer

Diamond Member
Aug 8, 2004
8,622
40
91
Originally posted by: homercles337
Look, IT people SUCK. You would not have had a "night of hell" if IT did their fucking job. Sounds like you have a bunch of fuck ups in IT if you ask me. Im a scientist so this may mean nothing.

You are clueless, shut up. Things happen. Budget and manpower can only get you so much reliability.

Give us an unlimited budget and unlimited manpower and then I can assure you of uptime, but until that fantasy world materializes, we do what we can with what we have.
 

ViviTheMage

Lifer
Dec 12, 2002
36,189
87
91
madgenius.com
That is interesting, you guys lost power for 6 hours?

I only work at an arboretum with 6 servers and 100+ computers, and we have a generator that can power a small city (10k+ people or so) for a few hours.... It cost us over a few million, but still: if WE have it, a hospital had better have something just as good, if not better.

Plus it runs on diesel and is the size of 4 semis :D
 

yoda291

Diamond Member
Aug 11, 2001
5,079
0
0
Here's something to chew on.... when power goes out on a floor, people call facilities. When power goes out in the server room, people call IT.
 

preslove

Lifer
Sep 10, 2003
16,754
64
91
Do us all a favor and tell us what hospital you work at so we can avoid it. Not that you're not a great nurse, but your bosses must be stupid :D