It’s been an interesting 24 hours.
Monday
8:00 PM
I get home after working a regular day and having supper with a friend.
8:15 PM
One of our friends from church, who happens to live a few blocks from my workplace, calls. “Did you know your workplace is on fire?” “Uhh, no….” “Yeah, big fire, flames, dozens of emergency vehicles…” “Hmm…”
So then I call my manager… “Did you know the company is on fire?” “Uhh… no….” “Yeah, sounds bad, I’m going to head in and see what’s going on.”
8:30 PM
I arrive at work. Cops have blocked all the entrances. I identify myself as an I.S. manager and am directed to the “manager area”. (Others are kept across the street.)
An eerie orange glow — and lots of smoke — is visible above the rooftop. This looks bad.
We learn that something in the compressor room caught fire or exploded. Fire spread upward quickly.
Most importantly, everyone got out without any injuries.
8:35 PM
Our T1 provider calls my cellphone. These are really, really good folks. I answer, as I’m standing next to a noisy fire truck.
ISP: Hi. Just calling to let you know that your T1 is down and we’re looking into it.
Me: Hmm. That might be because it’s on fire.
ISP: Uhm, did you just say it’s on fire?
Me: Could be. There’s a fire in the building and we’re not allowed in just yet. Plus the electric company is trying to cut the power.
ISP: OK then, I think we’re going to assume the problem is at your end for now.
This sounds bad. I know the I.S. dept. still has UPS power, since the phone system attendant is answering. But if the T1 is down, it seems that it must have burned up (its run goes very close to the affected area), and taken some phone lines with it.
9 PM
It’s below freezing already, and I didn’t bundle up very well. Several waiting (but unused) ambulances are nearby, and the paramedics invite those of us waiting to sit inside the heated vehicles.
9:30 PM
The CEO and CFO are permitted to tour the building with a firefighter. Word is that sprinklers have come on and there’s water in the I.S. dept. This is potentially really bad news. The I.S. dept is in the basement and water has been a problem before, so we’re prepared, but water and computers are never a good combination…
After a little while, we are allowed inside as well. There are 1 to 2 inches of water on the floor in the basement, but fortunately nothing has reached the raised floor yet. The fire department supplies an AC generator to power the sump pump, plus 6 squeegees. We supply 6 people. We stay on squeegee duty until about midnight.
Meanwhile, it turns out that a water pipe broke out in the shop near the fire. Water gets shut off. But unfortunately the floor up there appears to slope towards the staircase that leads to the basement. Water is coming down fast. We find what we can to block the path, but that just seems to make the pool deeper.
I periodically try to find someone who can bring a pump and a generator, or something, to help get this cleared up.
Midnight
I head home. The water situation is improving, but not resolved yet. Others remain.
Once home, I post a notice on our public website saying that our power, Internet, phone, and FAX lines are down. You just might have a problem reaching us.
I also e-mail some consultants, saying that the reason the VPN doesn’t work is that we had a fire. I also e-mail the new I.S. person who has an hour commute and advise him of the situation.
Our fire was top story on one of the local newscasts. We watch the video of it, sigh at the annoying “on the scene” reporter (motto: “things were interesting an hour ago”), and turn in.
1AM
Time to sleep.
Tuesday
6AM
Get up, in to work by 6:45.
Everything is dead. No power, no water, no phones, no heat, no gas. Also, fortunately, no more water coming down.
I call the other I.S. developer and tell him not to bother coming in right now, but maybe we’ll call later.
8:30 AM
An impromptu meeting happens. The CEO sets communications as the #1 priority: phone and basic Internet access. Plants 2 and 4 have working power, so if we can bring up some servers and network gear on a generator, computers over there will work.
No water in plant 1, where my office is.
So nearest restrooms are a 5-10 minute walk through dark and damp areas. Bottled water, coffee, and donuts appear at the halfway mark to the restrooms ;-)
9:00 AM
Generators start appearing near the I.S. dept. After getting them powered up, proper extension cords found, etc., we start plugging in the phone system.
Power strip 1 works.
I flick the switch on power strip 2, and it all goes dead. This generator’s circuit breaker keeps doing that every time we try to turn on the phone system.
So we try generator #2. Phone system finally powers up. But all incoming lines are ringing at phones in sales. Volunteers go up to answer those phones.
Meanwhile, phone tech shows up, and orders his own, larger generator.
I call our person that’s an hour away and tell him to go ahead and head in.
10:00 AM
Now we turn to the computers. We’re going to have to be careful what we bring up so we don’t overload the generator (and thus have an unclean shutdown on anything that’s already up).
I recable power to some switches, our T1 router (ISP calls: the T1 just came up! yay!), and our firewall.
Flick the switch on firewall.
Generator stays up. Firewall emits a high-pitched whine. Eeep. Flick the switch again!
After checking things out, we adjust the voltage a bit and determine that things are generally OK. I run the firewall on one of its two power supplies. At least if it burns one out, we can still run it off the other later. Turn it back on — whines again, but it goes away after a few minutes. Phew.
Next step: our main Linux file server. Flick the switch, and there’s a satisfying “whoosh” from the fans. Excellent.
Our developer shows up and observes later: “John, this was odd. When I got here, there were 7 guys standing around watching you boot up machines.”
11:00 AM
Decision time. Our main ERP server is an AIX box. 8U. About a dozen disks. Certain to be a large draw. Can the generator handle it? Yes, it should.
Should we power it up, given that it takes 45 minutes to come up and 20 to go down? We opt to try.
I cable it up and press the power button.
Nothing happens.
But that is normal for AIX. After a few minutes, we start seeing hex codes on the LCD. A few minutes later, disks start coming up. All looks good…
11:15 AM
Orange alert lamp comes on for the AIX box. Uhoh.
Turns out one of the drives in the RAID is bad. Time to call IBM, but at least it’s mirrored, so the box is up.
IBM service calls are always “fun” for this. I spent a total of about an hour on the phone with them, 90% of which was spent with them analyzing error logs trying to find out what exact part number to give us, and 10% spent trying to look up our account information.
Plus, the phone guy is now switching the phone system to his genset. He tries to hook it up through a home-type UPS, which is giving him fits. I get disconnected from IBM at least twice.
Noon
My supervisor goes to Subway and brings in lunch for all four of us in I.S. since we don’t really feel like we can leave.
Word arrives that the generator that’s powering the servers is leaking oil. Eeep. But it should be good until evening. On the plus side, the generator will automatically shut itself off when it runs out of oil. Won’t damage itself.
1PM
IBM calls, trying again to get information about what exact part they need to bring out. Only 29 minutes on the phone this time.
Word arrives that a 60kW generator has arrived that they will attempt to hook into the mains supply for the building. The main power distribution center has been burnt. Plus we aren’t allowed to touch large parts of it since fire inspectors and insurance adjustors aren’t done yet.
But no ETA on the big generator.
We make a preliminary decision to cut power to computers & the network at 5PM, and maintain the single phone system generator overnight.
2PM
Local IBM rep calls. He’ll be out between 3:30 and 4. I say that’ll be fine, but warn him that if he’s any later, the power may be down.
3:45PM
IBM rep shows up, replaces the dead disk. It works. Whee.
4:15PM
I start powering down the AIX box.
We then learn that ETA on the big generator is 5PM. So we start shutting down other servers as planned.
5PM
Big generator comes up. It works. We slowly bring up our systems. Everything is good.
BUT — we are basically maxing out the capacity on the generator. People will not be able to just come in and work like usual tomorrow.
We’ll be meeting in the morning to figure it all out.
6PM
We post signs at entrances warning people to not turn *anything* on.
Finally head home at 6:30.