Downtime

Today, the npw servers were down again between 11:20 and 18:40. As happened some months ago, and for much the same reason (a system update followed by a reboot, though without a kernel update), the VM host no longer booted. It runs Debian Lenny with Linux 2.6.26 and some packages taken from Testing.

After this reboot, all disks seemed to have vanished, so mdadm was unable to assemble the RAID volumes. Even manually re-inserting the relevant block device modules didn’t solve the problem. Booting a Linux 2.6.32 kernel did the trick, but then network connectivity was gone.

Quite a few months ago, a new NIC was installed because the old one supposedly caused heavy packet loss. The new Intel-based card did quite a good job in the meantime, and the system seemed to be almost hiccup-free. This time, however, the new card caused incompatibilities or instabilities, or just didn’t work, so we switched back to the old one. Problem solved, system booted, able to ping. Packet loss. That problem vanished shortly after, though.

At the moment, everything seems to be up and running again with zero packet loss. For now.
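
For keeping an eye on that "zero packet loss" claim, something like the following could be run from another machine. It is only a rough sketch, not what was actually used here: it assumes an iputils-style `ping` whose summary contains a line such as `10% packet loss` and that accepts `-c` for the packet count. The function names and the host in the usage note are made up for illustration.

```python
import re
import subprocess


def parse_packet_loss(ping_output: str) -> float:
    """Extract the packet-loss percentage from ping's summary text.

    Assumes an iputils-style summary such as:
    "20 packets transmitted, 18 received, 10% packet loss, time 19027ms"
    """
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))


def measure_packet_loss(host: str, count: int = 20) -> float:
    """Ping `host` `count` times and return the reported loss in percent.

    Hypothetical helper: relies on `ping -c <count>` being available.
    """
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero when packets are lost
    )
    return parse_packet_loss(result.stdout)
```

Calling `measure_packet_loss("server.example.org")` (a placeholder host name) every few minutes from cron and logging the result would at least show whether the loss comes back after the next reboot.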