"We thought we'd be OK," says Bob, "as the second aircon unit seemed to be holding up on its own, so we thought we could survive the three days. Then, at 6.00pm, the building's main aircon shut off, increasing the load on our server room unit. The whole server room shut down again and, yet again, we got no text message from the monitoring system."
"When we came back in on Tuesday morning, the comms room was even hotter than on Monday but we managed to get a junior aircon engineer in." Now you'd think even a junior aircon engineer should be quite capable of dealing with a broken aircon unit, but again, life isn't that simple.
Because the server room only contained the heat exchanger, the engineer needed roof access to reach the main aircon unit. "The trouble was that nobody is allowed on the roof without an hour of safety instruction, a method statement from us, and 24 hours' notice. We clearly weren't going to get the broken aircon unit fixed that day," says Bob.
"At this point, we realised we had a major problem. What we thought were two aircon units running redundantly were actually required in parallel, but because nobody had switched them off since the server room was built seven years ago, nobody knew this."
So Bob hired a 6kW portable aircon unit and stuck it inside the server room, with a pipe taking the hot air out through the server room door -- a short-term fix at best. Aside from being an obvious security risk, the open door also ruined the server room's insulation. Nevertheless, Bob hoped it would work.
It didn't. "On Wednesday morning we came in and the same thing had happened again; our comms room was down, and this time it was hotter still; the small aircon unit simply had not coped," says Bob.
"So we had a choice. We could either increase the shut-off point on the UPS, or we could switch off some of the servers. In some circumstances, servers will switch themselves off as the temperature rises, but once the room temperature gets to 45 degrees it's only going to keep rising. So we started switching off every server that we could survive without, and hired a bigger portable aircon unit."
After four days of crashes, this stabilised the server room, even if the door was now even wider open to accommodate the thicker tube blowing even more hot air out into the offices.
On Thursday, Bob finally managed to get a senior person from the aircon company in to have a look at the broken unit on the roof. He traced the problem to a seized pump, for which there was no chance of repair. But, in what looked like a change of fortune, although this old model of aircon unit was no longer manufactured, the engineer somehow managed to locate one.
"We paid for it, and it was due to be delivered on the Friday, but when Friday came we got a call saying they had dropped it off the back of the lorry and cracked the pressure unit, which could not be repaired. We'd have to buy a new aircon unit instead." More paperwork, and more people on the roof.
Now the trouble with new aircon units -- from our beleaguered manager's point of view -- is that under EU regulations they have to use a new, eco-friendly coolant that works at different pressures and therefore requires thicker pipes. Bob's server room required 120 feet of pipes to channel coolant to and from the units on the roof.
Finally -- and we're halfway into the second week at this point -- Bob had a stroke of luck.






