Ease the pain of network downtime by managing expectations

Avoid embarrassment by checking the obvious
The first server I ever administered was a Dell running NT4 Server. It was working well when I was asked to move it from the main office to a secure area. At first, the job seemed fairly simple.

There was no network point in the archive room where it was to live, so the first thing I had to do was arrange one. This entailed simply drilling a hole through a partition wall and running a cable through it. Having ensured that my new point was live by plugging a desktop system into it, I arranged with the rest of the company the best time to move the server.

As it turned out, the entire company was in a meeting discussing their next research project, so I had free run of the building and the network. They were having lunch sent in, so I figured I had plenty of time to allow for disasters. I unplugged the server screen and moved it to the new area.

I piled the keyboard, mouse, and UPS onto the server case, unlocked the wheels, and removed the power plug. The server was still active, running from the UPS, which was set to run the system for 20 minutes before closing down. In no time, everything was plugged back in, and I ran around the office to make sure that I could see server volumes on the desktop machines.

"Great", I thought, "a job well done". When the rest of the workforce returned, however, it turned out that the messaging services had stalled, and nobody could send email. I took a few more minutes to restart the services and kicked myself for not thinking to check them.

With any luck, that particular scenario will not occur again, since I recorded it in my server diary and added it to the procedure for similar operations.
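A written procedure like this can also be turned into a short script that runs after every move. What follows is only a minimal sketch in Python, not the procedure I actually used: the host name and the function names (port_open, run_checks, build_checks) are hypothetical, and the ports assume the conventional ones for file sharing over NetBIOS (TCP 139) and SMTP mail (TCP 25). Bear in mind that an open port only shows that something is listening; a stalled messaging service may still accept connections, so sending a real test email remains the better check.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def run_checks(checks):
    """Run each named check and return the names of those that failed, in order."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that blows up counts as a failure rather than stopping the run.
            ok = False
        if not ok:
            failed.append(name)
    return failed


def build_checks(server):
    """Assumed checklist for a file-and-mail server; adjust ports to your own setup."""
    return {
        "file sharing (TCP 139)": lambda: port_open(server, 139),
        "mail (SMTP, TCP 25)": lambda: port_open(server, 25),
    }
```

Calling run_checks(build_checks("fileserver.example")) returns a list of the checks that failed, so an empty list means every item on the list passed. Each new lesson from the server diary becomes one more entry in the dictionary.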

Have a contingency plan
It's possible that your work may overrun the time allotted, and that you'll need to abandon the job to let users back on. Be sure that it's possible to roll back to the original system state. But, before you abandon a job that's nearly complete, try some on-the-fly negotiations with the user base. They may be happy to stay offline for another hour if it prevents another shutdown in the near future.

It's important to know what the options are and where the point of no return lies. Thankfully, I've never gone past it. By employing a strict if-it-ain't-broke-don't-fix-it policy, I've managed to keep things reasonably functional. Any work I wasn't sure about went onto the test machines for evaluation before I implemented it, and I also created an additional backup. In a pinch, it would have been possible to plug my test server into the live network to replace the main machine.

In any event, you should build a margin of safety into any scheduled downtime slot. If you come in ahead of schedule, your team will feel good about it, and the user base will also think you have done well. If your estimates are too "realistic", you will have to live up to them or risk losing the confidence of both your team and the users.
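The arithmetic behind a padded downtime slot and a rollback deadline is simple enough to write down. The sketch below is illustrative only: the 50 percent margin is an assumed figure, not a recommendation from experience, and the function names are made up for the example.

```python
from datetime import datetime, timedelta


def announced_end(start, estimate, margin=0.5):
    """End time to announce to users: the honest estimate plus a safety margin.

    margin is a fraction of the estimate (0.5 means 50% extra); the default
    is an assumption chosen for illustration.
    """
    return start + estimate * (1 + margin)


def rollback_deadline(start, estimate, rollback_time, margin=0.5):
    """Latest moment to begin rolling back and still finish inside the announced slot."""
    return announced_end(start, estimate, margin) - rollback_time


start = datetime(2010, 9, 2, 12, 0)
estimate = timedelta(minutes=60)      # what you honestly expect the job to take
rollback = timedelta(minutes=20)      # how long restoring the original state takes

print(announced_end(start, estimate))                # 2010-09-02 13:30:00
print(rollback_deadline(start, estimate, rollback))  # 2010-09-02 13:10:00
```

With these figures, a job expected to take an hour is announced as a 90-minute slot, and 13:10 is the last moment to decide between rolling back and negotiating more time with the users.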
