Life of a Computer Scientist: Upgrade vs. False Alarm

I am having difficulty settling on the moral of the story I'm about to tell you, so please bear with me while I tell the story first.

I recently started upgrading sshd on a multi-user server from the original ssh 2.0.13, circa 1999, to the latest OpenSSH, 4.6p1 at the time of writing. The reason is complicated. The server is a Solaris 2.6 system that has been in use since before 1999, and it is already falling behind on security patches. The ssh is very old: its protocol 2 support had some bugs, so we always had to fall back to ssh protocol 1, which has long been deprecated.

Several people and I have root access to that server, but nobody has taken on the role of system admin. I think that's a case of collaborative irresponsibility. The department has an appointed system admin who is supposed to do the job, but he has been reluctant to keep that server up to date, especially since the department's migration to Linux a few years ago.

Some time ago, I read about yet another telnet remote root exploit and decided to disable all insecure means of login to that particular server, namely telnet, rsh, rlogin, rexec, and ftp. About a month later, I got an e-mail from the IT backup service telling me that the backup server could not log in using rsh. My professor asserted that I should find a resolution because I was the one who disabled the insecure services. Tell me about how it feels to be rewarded with mounting responsibility for a moment of attentiveness.

I told IT to use ssh, but they insisted on using an ssh protocol 2 key, and that was the last straw that prompted me to take the initiative and upgrade sshd.

After installing the software and converting the host keys, I thought it would be nice to run the new sshd on a separate port for testing. That allowed other people to help me iron out a number of problems, like the infamous "Disconnecting: Corrupted check bytes on input" problem when using ssh protocol 1 from an older client (the solution is to make cipher-3des1.c and cipher-bf1.c include openssl-compat.h).

However, the very evening after I started running sshd for testing, the department system admin was alerted to a possible intrusion. Although I had already told IT that I would upgrade sshd, another branch of IT, the intrusion response team, didn't know about it and found it suspicious that sshd was running on another port. This resulted in the server being taken offline for a few hours until I had a chance to intervene.

Well, what's the big deal? The original idea of running a test sshd on a separate port was to minimize downtime, so that our group members could continue using the old sshd while I prepared the new sshd for production. It ended up causing more downtime. The measure backfired because of the false intrusion alarm.

How can they tell that there is another ssh service running on a suspicious port? You can just scan all open ports and see which ones other than port 22 respond with a string like "SSH-1.99-OpenSSH_4.5". I think this is pretty absurd though; I could play a prank by ssh'ing into a legitimate host and using port forwarding to tunnel its own ssh port to any other port I want on that host, e.g. ssh hostname -R10022:hostname:22 -R11022:hostname:22 -R12022:hostname:22 and so on. This would make the system admin sweat without me violating any computing policies.
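For the curious, here is roughly what such a scan might look like. This is only a sketch of my guess at the approach, not what the intrusion response team actually runs; the function names (parse_ssh_banner, grab_banner, find_unexpected_sshd) are made up for illustration. It relies on the fact that an ssh server announces itself with an identification line of the form SSH-protoversion-softwareversion as soon as you connect.

```python
import socket


def parse_ssh_banner(line):
    """Return (protocol_version, software) if the line looks like an SSH
    identification string such as "SSH-1.99-OpenSSH_4.5", else None."""
    parts = line.strip().split("-", 2)
    if len(parts) != 3 or parts[0] != "SSH":
        return None
    # Anything after the first space is an optional comment field.
    software = parts[2].split(" ", 1)[0]
    return parts[1], software


def grab_banner(host, port, timeout=2.0):
    """Connect to host:port and return the first line the server sends,
    or None if the port is closed, silent, or times out."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(256)
    except OSError:
        return None
    if not data:
        return None
    return data.decode("ascii", errors="replace").splitlines()[0]


def find_unexpected_sshd(host, ports, expected_port=22):
    """Return (port, banner) pairs for every port other than expected_port
    that answers with an SSH identification string."""
    found = []
    for port in ports:
        if port == expected_port:
            continue
        banner = grab_banner(host, port)
        if banner is not None and parse_ssh_banner(banner) is not None:
            found.append((port, banner))
    return found
```

Calling find_unexpected_sshd("hostname", range(1, 65536)) would walk every port, slowly. Note that a check like this cannot tell a second sshd from a forwarded port, since both answer with the same kind of banner, which is exactly why the prank above would work.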

I think there are a number of morals of this story:

It is always a good idea to let others help you whenever you can. I knew about the "corrupted check bytes on input" problem, but another person helped me identify the fix because I invited everyone to test the new sshd. There are other times when I could have helped the department system admin if he were more responsive to suggestions and problem reports.
In an organization where one hand doesn't know what the other is doing, even a thoughtful plan can run into problems. In this case, a well-intended test bed triggered a false intrusion alarm. These problems aren't always fatal, but they take overhead to fix. Optimistically speaking though, I think experience can help someone avoid such problems.
Don't be so keen on taking care of something unless you want the responsibility for it to fall on you. Alternatively, I think I could have done a better job of promoting to my professor the value of the additional "not my jurisdiction" work that I do. Managing your superior is a difficult art to master, but it can be necessary.

Nevertheless, the upgrade is now complete, and several people are satisfied.

Monday, March 19, 2007
