
Zabbix vs Nagios comparison

Kristof Kovacs
Software Architect & DevOps Consultant

Hello, I’m Kristof, a human being like you, and an easy to work with, friendly guy.

I've been a programmer, a consultant, CIO in startups, head of software development in government, and built two software companies.

Some days I’m coding Golang in the guts of a system and other days I'm wearing a suit to help clients with their DevOps practices.


For years I used Nagios for server monitoring, but I'm now in the process of switching to Zabbix. I also run a third, much simpler system to monitor the main monitoring system.

Here is a practical comparison of Nagios vs Zabbix:

Zabbix

Pros:

  • Zabbix monitors all main protocols (HTTP, FTP, SSH, POP3, SMTP, SNMP, MySQL, etc.)
  • Alerts in e-mail and/or SMS
  • Very good web interface
  • Native agent available on Windows, OS X, Linux, FreeBSD, etc.
  • Multi-step web application monitoring (content, latency, speed)
  • Can visualize and compare any value it monitors
  • System "templates"
  • Monitoring of log files and reboots *
  • Local monitoring proxies **
  • Customizable dashboard screens
  • Real-time SLA reporting

Cons:

  • Zabbix is more complex to set up
  • Escalation is a bit strange ***
  • No flapping detection
  • Documentation is sometimes spotty
  • Requires a database (such as MySQL)

Nagios

Pros:

  • Nagios monitors all main protocols (HTTP, FTP, SSH, POP3, SMTP, SNMP, MySQL, etc.)
  • Alerts in e-mail and/or SMS
  • Multiple alert levels: CRITICAL, WARNING, OK
  • "Flapping" detection
  • Automatic topology display
  • Completely stand-alone, no other software needed
  • Web content monitoring
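
To make the "flapping" item concrete: flap detection suppresses alert storms from a service that keeps switching between states. The sketch below is a simplified illustration, not Nagios's actual implementation. It assumes a window of the last 21 check results and start/stop thresholds of 20% and 5%; the class name `FlapDetector` and its parameters are hypothetical. Nagios's documented algorithm additionally weights recent state changes more heavily, which this sketch omits.

```python
from collections import deque


class FlapDetector:
    """Simplified flap detection: share of state changes in a sliding window."""

    def __init__(self, window=21, high=20.0, low=5.0):
        self.states = deque(maxlen=window)  # most recent check results
        self.high = high                    # start flapping at or above this percentage
        self.low = low                      # stop flapping below this percentage
        self.flapping = False

    def percent_change(self):
        states = list(self.states)
        if len(states) < 2:
            return 0.0
        changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
        # Divide by the full window size so a short history is not over-weighted.
        return 100.0 * changes / (self.states.maxlen - 1)

    def record(self, state):
        """Store one check result and return whether the service is flapping."""
        self.states.append(state)
        pct = self.percent_change()
        if not self.flapping and pct >= self.high:
            self.flapping = True
        elif self.flapping and pct < self.low:
            self.flapping = False
        return self.flapping
```

The two thresholds give hysteresis: a service has to stay quiet for a while before it is considered stable again, so alerts do not resume after a single good check.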

Cons:

  • Nagios needs SSH access or an add-on (NRPE) to monitor remote system internals (open files, running processes, memory, etc.)
  • Web interface is mostly read-only ****
  • No charting of monitored values (different systems like "Cacti" or "Nagiosgraph" can be bolted on)
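
For context on how remote internals get checked: whether run over SSH or through NRPE, a Nagios check is a small program that prints one status line and reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Below is a minimal sketch of a memory check along those lines. The function names, the 80%/90% thresholds, and the reliance on a Linux-style `/proc/meminfo` with a `MemAvailable` field are assumptions made for illustration.

```python
# Exit codes following the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value, warn, crit):
    """Map a measured value to a state; higher values are worse."""
    if value is None:
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def memory_used_percent(meminfo_path="/proc/meminfo"):
    """Return used memory in percent, or None if it cannot be determined."""
    try:
        fields = {}
        with open(meminfo_path) as f:
            for line in f:
                key, _, rest = line.partition(":")
                fields[key.strip()] = int(rest.split()[0])
        total = fields["MemTotal"]
        available = fields["MemAvailable"]
        return 100.0 * (total - available) / total
    except (OSError, KeyError, ValueError, IndexError, ZeroDivisionError):
        return None


def main():
    """Print one status line and return the exit code for the monitoring system."""
    used = memory_used_percent()
    state = classify(used, warn=80.0, crit=90.0)
    detail = "no data" if used is None else f"{used:.1f}% used"
    print(f"MEMORY {STATE_NAMES[state]} - {detail}")
    return state
```

A real plugin script would end with `sys.exit(main())` so that the state reaches the monitoring server as the process exit code.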

* However, log and reboot monitoring means that one gets an "ERROR" and then a "RECOVERY" message instead of a single "CHANGED" or "REBOOTED" message. One gets used to it.

** For example, when there are multiple sites, each site can have its own "proxy" (local Zabbix monitor), taking load off the main Zabbix server, and collecting data even if the connection to the main server is severed.
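
The store-and-forward idea behind such a proxy can be sketched as follows. This only illustrates the concept and is not Zabbix's actual implementation or protocol; `BufferingProxy`, the `send` callable, and the buffer limit are hypothetical names chosen for the example.

```python
from collections import deque


class BufferingProxy:
    """Collects values locally and forwards them when the main server is reachable."""

    def __init__(self, send, max_buffered=10000):
        # send: callable taking a list of values; raises ConnectionError when the link is down
        self.send = send
        # Bounded buffer: the oldest values are dropped first if it fills up.
        self.buffer = deque(maxlen=max_buffered)

    def collect(self, host, key, value, timestamp):
        """Record one measurement, regardless of the state of the link."""
        self.buffer.append((timestamp, host, key, value))

    def flush(self):
        """Try to forward everything buffered; return the number of values sent."""
        if not self.buffer:
            return 0
        batch = list(self.buffer)
        try:
            self.send(batch)
        except ConnectionError:
            return 0  # keep the data and retry on the next flush
        for _ in batch:
            self.buffer.popleft()
        return len(batch)
```

Collection and forwarding are decoupled: measurements keep accumulating while the connection is down and are delivered in order once it returns.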

*** It's great that higher levels of escalation get "ERROR" alerts only after some time, but in Zabbix their "RECOVERY" messages are delayed too. I don't see the point.

**** In the Nagios web admin, one can acknowledge problems, disable alerts, and reschedule checks. But one cannot add a new host or service.

Of course, both systems have many more features than are listed here. I only wanted to list the points that I base my decision on.