Best time to do network maintenance?

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
This question has been a sore point with me. I can basically see two schools of thought as to what the best time is:

1. Weekend- The benefit of doing maintenance during the weekend (Friday after business hours through Monday before business hours) is that it will not affect customers. The downside is that if a problem shows up hours after the maintenance was performed, there will be few people there to investigate and fix the problem.

2. Before or during business hours during the week- The benefit would be that the people making the changes will be around for a few hours to make sure everything is running smoothly. If something goes kaka 2 hours after the changes were made, the proper people will be sober (hopefully) and in the building. The major downside to this is, of course, more of an impact on the customers. Cutting several hours of service out of their day (or at least potentially cutting off some service) would be a pain in the rear.

I prefer the second option. Yes, the customer has to put up with maintenance during the time they may be trying to use the service, but if something goes wrong, a fix would come much quicker than if this was happening on a weekend. Of course, with proper testing, potential problems would be cut down dramatically (but how many companies actually do testing before messing with their network in ways that could shut it down? :) ).

Of course, in a 24/7 facility you could give the people in the trenches the ability to fix these problems, or at least give them enough tools to better troubleshoot the problems while the IT guy is on his way in, but that could be expensive and cause other problems if the weekend shift guy doesn't know what he is doing (hopefully not the case if training is provided).

So enough of my rant padded question, what do you all think?
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
Good question n0c!

What my company encourages its customers to do, since most of their major network stuff happens after hours/on weekends, is to give us some notification and lead time for the pending cutover/change/whatever.

We arrange to have senior people available (either physically in the support center, or on-call), frequently have field people standing by (in case extra hands are needed), in some cases we contact the vendor to make sure they have people available (senior support and/or field), and notify the parts dispatch group, so they can be standing by.

So, maybe review whatever support contracts you maintain, and (if needed) upgrade them or change them to some kind of similar program.

Many of the customers I have worked with won't so much as turn a screw without a couple contingency plans in their pocket for as many "What if" situations as they can think of, with fallback plans.

Even a well-planned upgrade/update/major change can have a few surprises. People frequently overlook stuff like "Does the new component use more power than the old component? Do we have enough surplus in the system to make up for the difference and still have some safety margin?", "Do we have enough spare media (cables, converters, adapters), rack screws, power receptacles, etc.?" and "Is the code on the new equipment compatible with the code running on the remaining existing equipment?"
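The power question above comes down to simple arithmetic. Here is a minimal Python sketch of that check; the function name, the wattage figures and the 20% safety margin are all assumptions for illustration, not numbers from anyone's actual site.

```python
def power_headroom_ok(capacity_w, current_load_w, old_draw_w, new_draw_w,
                      safety_margin=0.20):
    """Return True if swapping the component still leaves the safety margin.

    All values are in watts. safety_margin is the fraction of total
    capacity that should stay unused (0.20 is an assumed figure).
    """
    # Load after the old component is removed and the new one is added.
    projected_load_w = current_load_w - old_draw_w + new_draw_w
    # Capacity that must remain free to keep the chosen margin.
    required_reserve_w = capacity_w * safety_margin
    return capacity_w - projected_load_w >= required_reserve_w


# Hypothetical numbers: a 2000 W supply carrying 1400 W,
# replacing a 150 W unit with a 300 W unit.
print(power_headroom_ok(2000, 1400, 150, 300))
```

The same comparison works for rack space or spare ports: work out what the change adds, subtract what it frees up, and compare against what you are willing to run without.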

Some places run a practice drill to make sure everything has been thought of ahead of time and run a smaller subnet as a pilot for a while ahead of time, some do a full project management workup.

I'm of the belief that this stuff is always gonna happen in the middle of the night, starting Friday night. Giving your (vendor/supplier/VAR) support organization a heads-up is probably the most overlooked thing. If they're not willing to have a few extra warm bodies available, find another place ... I don't think that level of service is unusual in the industry these days.

FWIW

Scott



 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
I'm not a part of the planning and whatnot of maintenance, I just get to put up with a lot of the problems that happen because of it ;)

I was just curious as to what other people think about this subject, in case I get a say about these things in the future ;)

That's definitely a lot of planning you put into it. I don't think we have done anything major enough to warrant all of that, but I know of an upgrade or two in the future where some of those things might be useful... Thanks :)
 

Mucman

Diamond Member
Oct 10, 1999
7,246
1
0
Great question n0c! I am very passionate about this subject because I think the way we do it where I work is terrible!

I have finally convinced him to dedicate a couple of hours, and to mention in our TOS that network and server upgrades will be taking place between 2 and 4am on Friday nights. I would love to write a client notification system but he doesn't see the necessity for it... Personally I would love to send out an E-mail a few hours before some work is getting done... at least that way our clients know we are busy bees :)
 

Nutz

Senior member
Sep 3, 2000
302
0
0
FWIW, most of our maintenance occurs on the weekends, because we have people working 24/7. If something goes "kaka" we'll know about it relatively quickly. I know it doesn't exactly answer your question, but it kinda goes to show that you need to plan around how your network is used. If nobody worked on the weekend, I think every admin would jump on doing the patches and infrastructure maintenance on the weekends. What they don't see is that if they did make a mistake, it could be DAYS until the users come back to work to a potentially dead network. I'd hate to be the admin that has to answer to the boss. I can see it now:

Boss: "You had 3 days to make the changes and test it all out. How did this happen?"
Tech: "But we thought it was installed and ready to go. How were we to know it was broke? There weren't any users complaining or outages reported...."

See where I'm going with this?
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
Contingency planning is vital.

A customer of mine was upgrading his system some years ago. The company was replacing all of their CableTron equipment with 3COM 1000 (workgroup) and 3000 (concentrator) units. After all of the connections were made, the team went to every desk to make sure they had connectivity. Everything was looking great by Sunday afternoon, everybody had connection to the servers, the Internet, and all the other offices, as appropriate.

Monday morning came, all the staff came in and lit up their machines. Then the calls started coming in.....most people couldn't get through to anything.

It seems that 3COM 3000s, by default, had "flow control" enabled (used to throttle individual workstations). Since the 3000s were aggregating and the traffic flow was well above the threshold, they started "flow controlling" the traffic from the 1000s (down to almost nothing).

After a few minutes of panic, the staff moved all the connections back to the 'Trons, and the calls to support were made to figure out the problem. Doug/L3Guy was the first to nail it (he's hard core 3COM), beating the 3COM tech support by HOURS. The units were re-configured, and the next weekend, the move was made again, this time with no panic on Monday morning.

I s'pose the moral of the story is that even when things look good at the end of the project, there are still some things that can come back and bite you in the butt, and it's probably a good idea to have the staff (and support) standing by for "going live" at full load.

When it's all over, assuming there's no major explosion, some flavor of "Well Done" for the participants is probably a good idea too. A little appreciation goes a long way (especially if it's deserved).

FWIW

Scott
 

Tallgeese

Diamond Member
Feb 26, 2001
5,775
1
0
As far as time goes: usually toward the end of a day for application work, weekend or after hours for infrastructure (servers, networking).
The reasoning: end of day still leaves a couple of users to check the app functions. We don't need end-users to help test functionality on infra stuff.

Still, no matter what time it occurs, my approach has always been:

Research, Plan, Plan Some More, Research Some More, Test Run, More Planning, A Bit More Research, Upgrade/Conversion

Sometimes, tho, the amount of pre-work that is warranted is a relative thing.

If the app in question is a core line-of-business application that affects 95% of the company when it's unavailable, then it deserves and warrants more attention than a single application that affects a handful of users...unless, of course, that handful is, say Payroll ;)

Quite frankly, the most important thing I drill into my own staff's heads is READING READING READING. We scour every tech note and release note available to us and highlight it for issues relevant to our particular infrastructure and setup.

Cannot even BEGIN to count how many times that has gotten me home with my family, rather than pulling my hair out at work all night long.
 

Tallgeese

Diamond Member
Feb 26, 2001
5,775
1
0
Originally posted by: ScottMac
Doug/L3Guy was the first to nail it (he's hard core 3COM), beating the 3COM tech support by HOURS.
True dat....

L3Guy = BADASS
 

Tallgeese

Diamond Member
Feb 26, 2001
5,775
1
0
Originally posted by: ScottMac
it's probably a good idea to have the staff (and support) standing by for "going live" at full load.
We always include a day or two "cool-off" after major changes where we're standing by for something unexpected to show up. Much better for them to have you available and not need you, than for them to need you and not have you available.

We also suggest our clients insist on the same stand-by time from any other 3rd party vendor they work with.
 

Mucman

Diamond Member
Oct 10, 1999
7,246
1
0
Sounds like you guys have different environments! When it comes to the office LAN, I can pick it apart whenever I want since there are only 4 of us :). As a web host, there is no decent time to reboot a machine (as I found out last week during a bunch of patches). At 2am I got a page on my cell!

 

Santa

Golden Member
Oct 11, 1999
1,168
0
0
There is no right answer, but obviously planning and having a contingency plan for backup is a definite must.

We test all patches and upgrades, hardware and software. Last but not least, make sure you're there early the next day to see if something didn't go through as previously tested, because no matter how much you tell them you tested it, users will not believe you if there is a problem after the install. Just try to collect yourself during the problem, put your contingency plan into effect, and then solve the problem.

Along with planning how to make the changes, you should document which systems will be affected and how long they can remain down.
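One way to keep that documentation is as a small table you can check the planned outage against. A rough Python sketch follows; the `AffectedSystem` class, the system names and the downtime limits are made up for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class AffectedSystem:
    name: str
    max_downtime: timedelta  # longest outage the business can tolerate


def systems_at_risk(systems, planned_outage):
    """Return names of systems that cannot tolerate the planned outage."""
    return [s.name for s in systems if s.max_downtime < planned_outage]


# Hypothetical inventory for one maintenance window.
inventory = [
    AffectedSystem("mail", timedelta(hours=4)),
    AffectedSystem("payroll", timedelta(minutes=30)),
    AffectedSystem("file server", timedelta(hours=8)),
]

# Any name printed here needs a fallback plan or a shorter window.
print(systems_at_risk(inventory, timedelta(hours=2)))
```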
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: TallGeese
As far as time goes: usually toward the end of a day for application work, weekend or afterhours for infrastructure (servers, networking)
The reasoning: end of day still leaves a couple of users to check the app functions. we don't need end-users to help test functionality on infra stuff.

The problem I see with this is that many people would be in a rush to get out. End of the day on Friday, people are wanting to leave and go out with friends or whatever (hardcore drinking for most of the admins I know :p). They rush with what they are doing, leave immediately afterwards and are "unavailable" the rest of the night. That's a consideration I've seen mentioned in plenty of books about doing changes during those time periods. I'll try and dig up something on it tonight. (Stephen Northcutt has written some good stuff)

I love the ideas you all have about this though, it's some interesting stuff! Definitely making me rethink my original opinion on the subject. Thanks :)

 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: Mucman
Great question n0c! I am very passionate about this subject because I think the way we do it where I work is terrible!

I have finally convinced him to dedicate a couple of hours, and to mention in our TOS that network and server upgrades will be taking place between 2 and 4am on Friday nights. I would love to write a client notification system but he doesn't see the necessity for it... Personally I would love to send out an E-mail a few hours before some work is getting done... at least that way our clients know we are busy bees :)

We have a scheduled time for maintenance that our customers know about. We also email all of our customers a day or two in advance to remind them. At least put something on your site to let them know when scheduled maintenance is. A mailing list might not be a bad idea either; it shouldn't be too tough.
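For what it's worth, the reminder part is only a few lines with Python's standard library. This is a sketch, not what we actually run: the function names, the addresses, the dates and the two-day lead time are placeholders, and actually sending the message (for example with `smtplib`) is left out.

```python
from datetime import datetime, timedelta
from email.message import EmailMessage


def reminder_send_time(window_start, days_before=2):
    """When to send the reminder: days_before days ahead of the window."""
    return window_start - timedelta(days=days_before)


def build_reminder(window_start, window_end, sender, recipients):
    """Build (but do not send) a maintenance reminder email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = (
        f"Scheduled maintenance {window_start:%Y-%m-%d %H:%M} "
        f"to {window_end:%H:%M}"
    )
    msg.set_content(
        "Network and server maintenance is scheduled from "
        f"{window_start:%H:%M} to {window_end:%H:%M} on "
        f"{window_start:%A, %Y-%m-%d}. Some services may be unavailable."
    )
    return msg


# Placeholder window and addresses.
start = datetime(2002, 3, 2, 2, 0)
end = datetime(2002, 3, 2, 4, 0)
msg = build_reminder(start, end, "noc@example.com", ["customer@example.com"])
print(reminder_send_time(start))
print(msg["Subject"])
```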
 

mcveigh

Diamond Member
Dec 20, 2000
6,457
6
81
Since I am the network admin for several small businesses, I can pretty much walk in and tell them the network has to go down NOW!...but I don't do that.;)

A friend's company is on call 24/7, and that sometimes makes it hard to get things done because I never know if they will be in the middle of a flight at midnight on Fridays (they are an air ambulance company). Other places don't want to wait after hours for me to get done and don't want to give me a key, so I have to work during business hours.

All in all I like doing the early morning thing, but I'm not a morning person :|
I can find out quickly if there are problems with my changes. :cool: