Server monitoring

Q I have three Ubuntu servers all running different services (Apache, MySQL, FTP, etc). These computers do not have very reliable hardware, so I was wondering if there is any open source software out there that can monitor multiple servers. I would prefer to get the output in a web page, so I could access it from my PDA via the internet. Will I need to hand-code it or is there anything out there ready-made?

A There are a number of programs that will do what you want, with varying degrees of sophistication. At the more complex end of the range is Nagios (www.nagios.org), but something a little less complicated should be more than adequate for your needs. Monit (www.tildeslash.com/monit) is mainly intended for monitoring programs on the machine running it, though it can watch remote servers too. It's generally a good idea to run the monitor on a different machine from the ones being monitored, otherwise a problem on the server could also bring down the monitor, leaving you with no warning that anything had happened. Monit can be told which services to test and what to do when they fail, so you don't have to rely on remembering to check a web page to see that something is wrong: Monit can send you an email. Even better, it can execute an external program or restart the service. The latter option is intended for local services, but you could make the restart command

ssh remote.server /etc/init.d/service restart

provided you've set up key-based SSH on the remote server so that this can run without pausing for a password. Other possible external actions would be to use the xsend script from xmpppy (http://xmpppy.sourceforge.net) to send an instant message to your PDA, alerting you immediately, or send an email via an email-to-SMS gateway to alert you with a text message. It all depends on how urgently you need to know when a problem arises, and which way is most likely to reach you first. Here is an extract from a working config file

set mailserver mail.example.com
set alert me@example.com
set httpd port 2812 and allow admin:monit

check host slartibartfast with address 192.168.13.27
  if failed icmp type echo count 3 with timeout 3 seconds then alert
  if failed port 3306 protocol mysql with timeout 15 seconds then alert
  if failed url http://example.com then alert

The first part covers global settings, including where to send email alerts and enabling the web interface. This allows connections from anywhere, but controlled by a login, so you can use your PDA wherever you are. The second block performs three tests on a remote host, and sends an email alert if any of them fails. If you want to have Monit restart services, you will need a separate block for each service, like this

check host example.com with address 192.168.1.27
  if failed port 3306 protocol mysql with timeout 15 seconds
    then exec "/usr/bin/ssh root@example.com /etc/init.d/mysql restart"
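One way to make sure the remote restart can never sit waiting for a password is to wrap the ssh call in a small script. What follows is only a sketch: the script name, the DRY_RUN variable and the ten-second timeout are assumptions made for illustration, and it presumes that key-based SSH is already working for the user Monit runs as.

```shell
#!/bin/sh
# Sketch of a hypothetical wrapper, e.g. /usr/local/bin/restart_remote.sh,
# for Monit's exec action. Assumes key-based SSH is already set up for the
# user Monit runs as, for example with:
#   ssh-keygen -N ''            (key with an empty passphrase)
#   ssh-copy-id root@example.com

restart_remote() {
    host=$1
    service=$2
    # BatchMode=yes makes ssh fail rather than prompt for a password;
    # ConnectTimeout stops an unreachable host from hanging the call.
    set -- ssh -o BatchMode=yes -o ConnectTimeout=10 \
        "$host" "/etc/init.d/$service restart"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it, for testing.
        echo "$*"
    else
        "$@"
    fi
}

# Run only when called with exactly two arguments: HOST SERVICE.
if [ "$#" -eq 2 ]; then
    restart_remote "$1" "$2"
fi
```

With the script saved under the assumed path, the exec line would become then exec "/usr/local/bin/restart_remote.sh root@example.com mysql". Setting DRY_RUN=1 in the environment prints the ssh command rather than running it, which is a safe way to check the wrapper before letting Monit use it.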

The exec command sends an alert too, so you'll know that the service has failed and has been restarted. You can also look at Monit's web interface from time to time to reassure yourself that all is well. Monit can do a lot more than watch servers: it can check CPU load and disk space, or watch for changes to the contents or permissions of files or directories. The example configuration file covers most uses; simply uncomment and edit the sections you want to use.
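As a rough sketch of those extra checks, a fragment along the following lines could be added to the config. The check names (localhost, rootfs, passwd) and the thresholds are illustrative assumptions, and the exact keywords vary a little between Monit versions (older releases use check device rather than check filesystem). These tests look at the machine Monit itself runs on, so they need a Monit instance on each server you want covered.

```
# Illustrative only: names and thresholds are assumptions, not recommendations
check system localhost
  if loadavg (5min) > 4 then alert

check filesystem rootfs with path /
  if space usage > 90% then alert

check file passwd with path /etc/passwd
  if failed permission 644 then alert
  if changed checksum then alert
```

Running monit -t after editing checks the control file for syntax errors before the new rules are loaded.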
