Kristof Kovacs

Software architect, consultant

Zabbix vs Nagios comparison

For years I used Nagios for server monitoring, but I'm now in the process of switching to Zabbix. I also run a third, much simpler system to watch the main monitoring system itself.

Here is a practical comparison of Nagios vs Zabbix:

Zabbix

Pros:

  • Zabbix monitors all main protocols (HTTP, FTP, SSH, POP3, SMTP, SNMP, MySQL, etc.)
  • Alerts in e-mail and/or SMS
  • Very good web interface
  • Native agent available on Windows, OS X, Linux, FreeBSD, etc.
  • Multi-step web application monitoring (content, latency, speed)
  • Can visualize and compare any value it monitors
  • System "templates"
  • Monitoring of log files and reboots *
  • Local monitoring proxies **
  • Customizable dashboard screens
  • Real-time SLA reporting

Cons:

  • Zabbix is more complex to set up
  • Escalation is a bit strange ***
  • No flapping detection
  • Documentation is sometimes spotty
  • Requires a database (such as MySQL)

Nagios

Pros:

  • Nagios monitors all main protocols (HTTP, FTP, SSH, POP3, SMTP, SNMP, MySQL, etc.)
  • Alerts in e-mail and/or SMS
  • Multiple alert levels: CRITICAL, WARNING, OK (plus UNKNOWN)
  • "Flapping" detection
  • Automatic topology display
  • Completely stand-alone, no other software needed
  • Web content monitoring
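
To make "flapping" detection concrete: the idea is to notice when a service keeps bouncing between states and to hold back the flood of alerts that would follow. Below is a minimal sketch of that idea in Python. It is a simplification, not Nagios's actual algorithm (as far as I know, Nagios also weights recent changes more heavily), and the class name, window size, and thresholds are my own assumptions for illustration.

```python
from collections import deque


class FlapDetector:
    """Flags a service as flapping when its state changes too often.

    Hypothetical helper: window and threshold values are illustrative.
    """

    def __init__(self, window=21, high=50.0, low=25.0):
        self.history = deque(maxlen=window)  # most recent check results
        self.high = high  # start flapping at or above this % of state changes
        self.low = low    # stop flapping once the % drops below this
        self.flapping = False

    def percent_change(self):
        """Share of possible state changes in the window that actually occurred."""
        states = list(self.history)
        changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
        # Divide by the full window so a short history cannot trigger flapping early.
        return 100.0 * changes / (self.history.maxlen - 1)

    def record(self, state):
        """Store a new check result and return whether the service is flapping."""
        self.history.append(state)
        pct = self.percent_change()
        if not self.flapping and pct >= self.high:
            self.flapping = True
        elif self.flapping and pct < self.low:
            self.flapping = False
        return self.flapping
```

The two thresholds give some hysteresis: a service has to calm down well below the level that started the flapping before normal alerting resumes.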

Cons:

  • Nagios needs SSH access or an add-on (NRPE) to monitor remote system internals (open files, running processes, memory, etc.)
  • Web interface is mostly read-only ****
  • No charting of monitored values (separate tools such as "Cacti" or "Nagiosgraph" can be bolted on)

* Log and reboot monitoring works, although one gets an "ERROR" and a "RECOVERY" message instead of a single "CHANGED" or "REBOOTED" message. One gets used to it.

** For example, when there are multiple sites, each site can have its own "proxy" (local Zabbix monitor), taking load off the main Zabbix server and collecting data even if the connection to the main server is severed.
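
The "keeps collecting while the link is down" behaviour can be sketched in a few lines of Python. This is only an illustration of the buffering idea under my own assumptions, not how the Zabbix proxy is implemented: `BufferingProxy`, `collect`, `flush`, and the `send` callback are hypothetical names.

```python
from collections import deque


class BufferingProxy:
    """Collects values locally and forwards them when the server is reachable."""

    def __init__(self, send, max_buffered=10000):
        # send: hypothetical callable that forwards a batch to the main server
        # and raises ConnectionError when the server cannot be reached.
        self.send = send
        # Bounded buffer: the oldest values are dropped if the outage lasts too long.
        self.buffer = deque(maxlen=max_buffered)

    def collect(self, item, value, timestamp):
        """Store one measurement locally, whether or not the link is up."""
        self.buffer.append((timestamp, item, value))

    def flush(self):
        """Try to forward everything buffered; return how many values were sent."""
        if not self.buffer:
            return 0
        batch = list(self.buffer)
        try:
            self.send(batch)
        except ConnectionError:
            return 0  # keep the data and retry on the next flush
        self.buffer.clear()
        return len(batch)
```

Calling `flush()` on a timer is enough for this sketch: during an outage it returns 0 and the data stays put, and the first successful call delivers the backlog with its original timestamps.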

*** It's great that higher levels of escalation get "ERROR" alerts only after some time, but in Zabbix their "RECOVERY" messages are delayed too. I don't see the point.
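
For comparison, here is a minimal Python sketch of the escalation behaviour I would have expected: "ERROR" alerts reach higher levels only after a delay, while "RECOVERY" goes out at once to everyone who was alerted. This is not Zabbix's or Nagios's actual mechanism; `Level`, `Escalation`, and the delay values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Level:
    contact: str
    delay_minutes: int  # how long a problem must last before this level is alerted


class Escalation:
    """Delays ERROR alerts per level, but sends RECOVERY immediately."""

    def __init__(self, levels):
        self.levels = levels
        self.alerted = []  # contacts who already received an ERROR alert

    def on_error(self, minutes_since_start):
        """Return the contacts who become due for an ERROR alert right now."""
        due = [
            level.contact
            for level in self.levels
            if level.delay_minutes <= minutes_since_start
            and level.contact not in self.alerted
        ]
        self.alerted.extend(due)
        return due

    def on_recovery(self):
        """Return everyone who saw the ERROR, with no extra delay, and reset."""
        notified, self.alerted = self.alerted, []
        return notified
```

With levels such as `Level("admin", 0)` and `Level("manager", 30)`, a problem fixed within ten minutes never reaches the manager, and one that lasts longer notifies both of the recovery at the same moment.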

**** In the Nagios web admin, one can acknowledge problems, disable alerts, and reschedule checks, but one cannot add a new host or service.

Of course, both systems have many more features than those listed here. I only wanted to list the points that I based my decision on.

-- Kristof

