Installing Nagios on CentOS 6 (64-bit)

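Published: 2020-06-20 11:48:05 | Source: web | Views: 1407 | Author: liveforlinux | Category: Mobile development
Based on netseek's PDF guide; tested on CentOS 6 (64-bit).
Nagios installation steps
1 Before installing, make sure you have root access to the machine.
Confirm that the following packages are installed on your Linux system before continuing:
Apache
GCC compiler
GD library and its development library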
yum -y install httpd gcc glibc glibc-common gd gd-devel 
  
2  
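Create the nagios account
/usr/sbin/useradd nagios && passwd nagios
Create a group named nagcmd, used to run external commands from the web interface,
and add both the nagios user and the apache user to it
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache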
 
3 
Download Nagios and the plugin packages
Download the Nagios and Nagios plugins tarballs (visit http://www.nagios.org/download/ for the latest versions)
cd  /usr/local/src  
wget  http://nchc.dl.sourceforge.net/sourceforge/nagios/nagios-3.0.6.tar.gz 
wget  http://nchc.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz  
 
4 
Compile and install Nagios
cd  /usr/local/src  
tar zxvf  nagios-3.0.6.tar.gz 
cd  nagios-3.0.6 
./configure --with-command-group=nagcmd  --prefix=/usr/local/nagios  
make all  
make install  
make install-init  
make install-config  
make install-commandmode  
 
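Verify that the program was installed correctly. Change to the installation path (/usr/local/nagios here) and check that
the five directories etc, bin, sbin, share, and var exist. If they do, the program was installed correctly.
A brief description of the five directories:
bin    the Nagios executables
etc    configuration files
sbin   the CGI programs used by the web interface
share  the web interface pages
var    logs and runtime data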
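The directory check described above can be scripted. This is a minimal sketch that assumes the /usr/local/nagios prefix used in this guide; the function name check_nagios_dirs is hypothetical and is not part of Nagios.

```shell
#!/bin/sh
# check_nagios_dirs PREFIX
# Prints every expected directory that is missing under PREFIX and
# returns non-zero if at least one is missing.
# (Hypothetical helper for illustration, not shipped with Nagios.)
check_nagios_dirs() {
    prefix=$1
    missing=0
    for d in etc bin sbin share var; do
        if [ ! -d "$prefix/$d" ]; then
            echo "missing: $prefix/$d"
            missing=1
        fi
    done
    return $missing
}

# Usage: check_nagios_dirs /usr/local/nagios && echo "install looks complete"
```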
 
5 
Compile and install the Nagios plugins (nagios-plugins)
cd  /usr/local/src  
tar zxvf  nagios-plugins-1.4.13.tar.gz  
cd  nagios-plugins-1.4.13  
./configure --with-nagios-user=nagios --with-nagios-group=nagios  --prefix=/usr/local/nagios  
 
make && make install  
Verify:
ls /usr/local/nagios/libexec
This lists the installed plugin files; all plugins are installed into the libexec directory.
 
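6 Configure the web interface
Method 1: run make install-webconf from the Nagios source directory when installing Nagios.
Create a nagiosadmin user for logging in to the Nagios web interface. Remember the password
you set; you will need it shortly.
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache for the settings to take effect.
service httpd restart
Method 2: append the following to the end of httpd.conf: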
#for nagios 
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin  
<Directory "/usr/local/nagios/sbin">  
    Options ExecCGI  
    AllowOverride None  
    Order allow,deny  
    Allow from all  
    AuthName "Nagios Access"  
    AuthType Basic  
    AuthUserFile /usr/local/nagios/etc/htpasswd  
    Require valid-user  
</Directory>  
  
Alias /nagios /usr/local/nagios/share 
 
<Directory "/usr/local/nagios/share">  
    Options None  
    AllowOverride None  
    Order allow,deny  
    Allow from all  
    AuthName "Nagios Access"  
    AuthType Basic  
    AuthUserFile /usr/local/nagios/etc/htpasswd  
    Require valid-user  
</Directory>  
 
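htpasswd -c /usr/local/nagios/etc/htpasswd test
New password: (enter 123456)
Re-type new password: (enter the password again)
Adding password for user test
View the contents of the authentication file
less /usr/local/nagios/etc/htpasswd
test:OmWGEsBnoGpIc (the part before the colon is the username, test; the part after it is the hashed password)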
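Because each line of the authentication file has the form user:hash, the usernames can be listed without opening the file in a pager. A small sketch; the function names htpasswd_users and htpasswd_has_user are hypothetical.

```shell
#!/bin/sh
# htpasswd_users FILE
# Prints the username (the field before the first colon) of every entry.
htpasswd_users() {
    cut -d: -f1 "$1"
}

# htpasswd_has_user FILE USER
# Succeeds if USER has an entry in FILE (exact, literal match).
htpasswd_has_user() {
    htpasswd_users "$1" | grep -qxF -- "$2"
}

# Usage: htpasswd_has_user /usr/local/nagios/etc/htpasswd test && echo "test can log in"
```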
 
This example adds a user named test, so cgi.cfg must be edited to authorize that user
vi /usr/local/nagios/etc/cgi.cfg
    authorized_for_system_information=test
    authorized_for_configuration_information=test
    authorized_for_system_commands=test
    authorized_for_all_services=test
    authorized_for_all_hosts=nagiosadmin,test
    authorized_for_all_service_commands=test
    authorized_for_all_host_commands=test
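It is easy to miss one of the authorized_for_* directives when editing by hand. The sketch below lists the directives in cgi.cfg that do not yet name a given user. The function name cgi_missing_user is hypothetical; the sketch treats a value of * as matching every user.

```shell
#!/bin/sh
# cgi_missing_user FILE USER
# Prints the name of every authorized_for_* directive in FILE whose
# comma-separated value list contains neither USER nor "*".
# Commented-out directives (lines starting with #) are ignored.
cgi_missing_user() {
    awk -F= -v user="$2" '
        /^[[:space:]]*authorized_for_/ {
            name = $1
            gsub(/[[:space:]]/, "", name)
            n = split($2, users, ",")
            found = 0
            for (i = 1; i <= n; i++) {
                u = users[i]
                gsub(/[[:space:]]/, "", u)
                if (u == user || u == "*") found = 1
            }
            if (!found) print name
        }' "$1"
}

# Usage: cgi_missing_user /usr/local/nagios/etc/cgi.cfg test
```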
 
7 
Start Nagios
Add Nagios to the service list so that it starts automatically at boot
chkconfig --add nagios
chkconfig nagios on
Verify the sample Nagios configuration files
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
You may see an error like this:
Nagios 3.0.6 
Copyright (c) 1999-2008 Ethan Galstad (http://www.nagios.org) 
Last Modified: 12-01-2008 
License: GPL 
Error: Cannot open main configuration file '/usr/local/‐' for reading!
Changing the file permissions did not help. Restarting the nagios service directly got it to start.
Nagios 3.0.6 starting... (PID=2821) 
Local time is Thu Feb 16 14:24:25 CST 2012 
Bailing out due to one or more errors encountered in the configuration files. Run Nagios from the command line with the -v option to verify your config before restarting. (PID=2821) 
 
If no errors are reported, start the Nagios service
service  nagios  start  
service  httpd   start 
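The "verify first, then start" order can be enforced with a small wrapper. This is a sketch only: verify_config is a hypothetical name, and it relies on the verification command returning a non-zero exit status when the configuration has errors.

```shell
#!/bin/sh
# verify_config COMMAND [ARGS...]
# Runs the given verification command quietly and reports whether it is
# safe to start Nagios. Returns 1 if the verification command fails.
verify_config() {
    if "$@" >/dev/null 2>&1; then
        echo "config OK: safe to run 'service nagios start'"
    else
        echo "config has errors: fix them before starting Nagios"
        return 1
    fi
}

# Usage:
# verify_config /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg \
#     && service nagios start
```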
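8 Put SELinux into permissive mode (running this one command is enough)

setenforce 0
To make the change permanent, edit the setting in /etc/selinux/config and reboot.
Alternatively, instead of disabling SELinux or changing its mode permanently, keep it enforcing in targeted mode and relabel the CGI and web directories:
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/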
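For the permanent change in /etc/selinux/config mentioned above, the line to edit is the one starting with SELINUX=. A minimal sketch, with set_selinux_mode as a hypothetical helper name; try it on a copy of the file first, and note that a reboot is still needed afterwards.

```shell
#!/bin/sh
# set_selinux_mode FILE MODE
# Rewrites the SELINUX= line of FILE (normally /etc/selinux/config) to MODE.
# The SELINUXTYPE= line is left alone because the pattern requires "="
# directly after "SELINUX".
set_selinux_mode() {
    cfg=$1
    mode=$2
    case $mode in
        enforcing|permissive|disabled) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    # Write through a temporary file, then copy the content back so the
    # original file keeps its inode and permissions.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$cfg" > "$cfg.tmp" \
        && cat "$cfg.tmp" > "$cfg" \
        && rm -f "$cfg.tmp"
}

# Usage: set_selinux_mode /etc/selinux/config permissive
```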
 
9 
Test
Open http://localhost/nagios/ and log in with username test and password 123456.
 
10 Configuring monitoring of a remote host
1 On the monitored host
Add the user
useradd nagios
Set its password
passwd nagios
Install the Nagios plugins
wget  http://nchc.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz 
tar zxvf nagios-plugins-1.4.13.tar.gz  
cd nagios-plugins-1.4.13 
./configure  
make 
make install 
chown nagios.nagios /usr/local/nagios/ 
chown  -R  nagios.nagios /usr/local/nagios/libexec/ 
 
 
2 Install NRPE (required on both the monitoring host and the monitored host)
tar -zxvf  nrpe-2.8.1.tar.gz  
cd  nrpe-2.8.1 
./configure  
make all  
make install-plugin 
make install-daemon 
make install-daemon-config 
 
3 vim /usr/local/nagios/etc/nrpe.cfg 
#allowed_hosts=127.0.0.1 
# 192.168.1.130 is the address of the monitoring host
allowed_hosts=127.0.0.1,192.168.1.130

Edit /etc/hosts.allow and add the monitoring host's IP
 
echo 'nrpe:192.168.1.130' >> /etc/hosts.allow  
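A quick way to confirm that the monitoring host really is listed in allowed_hosts is to parse nrpe.cfg. This is a sketch; nrpe_allows is a hypothetical name, and it only does exact string comparison on the comma-separated entries (no network ranges or hostnames).

```shell
#!/bin/sh
# nrpe_allows FILE IP
# Succeeds if IP appears in an active (uncommented) allowed_hosts line of FILE.
nrpe_allows() {
    awk -F= -v ip="$2" '
        /^[[:space:]]*allowed_hosts[[:space:]]*=/ {
            n = split($2, hosts, ",")
            for (i = 1; i <= n; i++) {
                h = hosts[i]
                gsub(/[[:space:]]/, "", h)
                if (h == ip) found = 1
            }
        }
        END { exit (found ? 0 : 1) }' "$1"
}

# Usage: nrpe_allows /usr/local/nagios/etc/nrpe.cfg 192.168.1.130 || echo "not allowed yet"
```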
4 Start the service
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Check that the NRPE service is working (test with 127.0.0.1, not localhost)
/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.8.1 
 
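5 Test from the monitoring host (192.168.1.130); the output below indicates success
/etc/init.d/iptables stop    (or, instead of stopping iptables, add a rule that allows collecting data from the monitored host)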
/usr/local/nagios/libexec/check_nrpe -H 192.168.1.129 
NRPE v2.8.1 
 
 
Then, on the monitoring host
1 vim /usr/local/nagios/etc/objects/129.cfg with the following contents
define host{ 
 
use            linux-server 
 
host_name    129 
 
alias        129 
 
address      192.168.1.129 
 
} 
 
define service{ 
 
use generic-service 
 
host_name 129 
 
service_description load 
 
check_command check_nrpe!check_load 
# with custom arguments
#check_command check_nrpe!check_load!6.0,5.0,4.0!15.0,8.0,6.0 
} 
 
vim /usr/local/nagios/etc/nagios.cfg and add the following
# Definitions for monitoring 192.168.1.129 
cfg_file=/usr/local/nagios/etc/objects/129.cfg       
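When several hosts are added this way, the per-host file can be generated instead of typed. The sketch below prints the same host and load service definitions for any name and address; gen_host_cfg is a hypothetical helper, and the linux-server and generic-service templates are the ones referenced in the definitions above.

```shell
#!/bin/sh
# gen_host_cfg NAME ADDRESS
# Prints a host definition plus an NRPE load service for that host.
gen_host_cfg() {
    cat <<EOF
define host{
        use             linux-server
        host_name       $1
        alias           $1
        address         $2
        }

define service{
        use                     generic-service
        host_name               $1
        service_description     load
        check_command           check_nrpe!check_load
        }
EOF
}

# Usage: gen_host_cfg 129 192.168.1.129 > /usr/local/nagios/etc/objects/129.cfg
```

The generated file still has to be registered with a cfg_file= line in nagios.cfg.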
 
 
vim /usr/local/nagios/etc/objects/commands.cfg 
# 'check_nrpe ' command definition 
define command{ 
 
command_name check_nrpe 
 
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ 
 
} 
 
Reload Nagios on the monitoring host
service nagios reload
Open http://192.168.1.130/nagios and host 129 should now be listed
 
 
 
Monitoring swap with Nagios
On the monitored host, edit /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg and add
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
Restart the NRPE service
[root@localhost libexec]# ps -ef | grep nrpe                  
nagios    2332     1  0 14:24 ?        00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d 
root      2373 28887  0 14:25 pts/0    00:00:00 grep nrpe 
kill -9 2332 
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d 
 
On the monitoring host
Add to /usr/local/nagios/etc/objects/commands.cfg
# check_swap command definition 
define command{ 
        command_name    check_swap 
        command_line    $USER1$/check_swap -w $ARG1$ -c $ARG2$ 
        } 
 
 
 
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
define service{ 
use generic-service 
host_name 129 
service_description swap 
check_command check_nrpe!check_swap 
} 
 
 
Restart the nagios and httpd services
service nagios restart 
service httpd restart              
 
Monitoring disk usage with Nagios

On the monitored host, edit /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg and add
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /
Restart the NRPE service
[root@localhost libexec]# ps -ef | grep nrpe                  
nagios    2332     1  0 14:24 ?        00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d 
root      2373 28887  0 14:25 pts/0    00:00:00 grep nrpe 
kill -9 2332 
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d 
 
On the monitoring host
Add to /usr/local/nagios/etc/objects/commands.cfg
define command{ 
        command_name    check_disk 
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$ 
        } 
 
 
 
 
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
 
define service{ 
use generic-service 
host_name 129 
service_description disk 
check_command check_nrpe!check_disk 
} 
 
Restart the nagios and httpd services
service nagios restart 
service httpd restart              
 
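Monitoring memory with Nagios
The memory check script is shown below. The rows of # characters only mark where it starts and ends; #!/bin/bash must be the first line of the actual file.
######################################
#!/bin/bash
# check memory script
# Reports total, used, and free memory in MB. It does not evaluate the
# -w/-c arguments passed in nrpe.cfg and always exits 0 (OK).

# Read the "Mem:" row (second line) of `free -m`
TOTAL=`free -m | head -2 |tail -1 |gawk '{print $2}'`
USED=`free -m | head -2 |tail -1 |gawk '{print $3}'`
FREE=`free -m | head -2 |tail -1 |gawk '{print $4}'`
# to calculate free percent
# use the expression free * 100 / total
FREETMP=`expr $FREE \* 100`
PERCENT=`expr $FREETMP / $TOTAL`
echo "$TOTAL MB Total Memory"
echo "$USED MB Used Memory"
echo "$FREE MB ($PERCENT%) Free Memory"
exit 0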
###################################### 
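The script above always exits 0, so Nagios can never raise an alert from it. A minimal sketch of a threshold-aware variant follows. It assumes (the original does not say) that the -w and -c values are percentages of free memory, with WARNING at or below the first and CRITICAL at or below the second. The names mem_state and free_percent are hypothetical; the exit codes 0, 1, and 2 follow the Nagios plugin convention for OK, WARNING, and CRITICAL.

```shell
#!/bin/sh
# free_percent
# Reads `free -m` style output on stdin and prints free memory as an
# integer percentage of total, taken from the "Mem:" row.
free_percent() {
    awk '/^Mem:/ { printf "%d\n", $4 * 100 / $2 }'
}

# mem_state FREE_PERCENT WARN CRIT
# Prints the state name and returns the matching Nagios exit code.
# Assumption: lower free percentage is worse, so CRIT should be below WARN.
mem_state() {
    if [ "$1" -le "$3" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$1" -le "$2" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# Usage inside a plugin script:
#   pct=$(free -m | free_percent)
#   mem_state "$pct" 20 10
#   exit $?
```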
 
On the monitored host, edit /usr/local/nagios/etc/nrpe.cfg
vim /usr/local/nagios/etc/nrpe.cfg and add
command[check_mem]=/usr/local/nagios/libexec/check_mem -w 150 -c 200
Put the check_mem script in /usr/local/nagios/libexec/ and make it executable
chmod +x /usr/local/nagios/libexec/check_mem
chown nagios.nagios /usr/local/nagios/libexec/check_mem

Restart the NRPE service
[root@localhost libexec]# ps -ef | grep nrpe                  
nagios    2332     1  0 14:24 ?        00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d 
root      2373 28887  0 14:25 pts/0    00:00:00 grep nrpe 
kill -9 2332 
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d 
 
On the monitoring host
Add to /usr/local/nagios/etc/objects/commands.cfg
define command{ 
       command_name    check_mem 
        command_line    $USER1$/check_mem -w $ARG1$ -c $ARG2$ 
        } 
 
 
 
 
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
 
define service{ 
 
use generic-service 
 
host_name 129 
 
service_description memory 
 
check_command check_nrpe!check_mem 
} 
 
Restart the nagios and httpd services
service nagios restart 
service httpd restart        
 
       
 
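Monitoring HTTP availability with Nagios
Nothing needs to be done on the monitored host (check_http does not go through NRPE)


On the monitoring host
/usr/local/nagios/etc/objects/commands.cfg already defines the check_http command, so nothing needs to be added there either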
 
 
 
 
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
 
define service{ 
 
use generic-service 
 
host_name 129 
 
service_description http 
 
check_command check_http    ; note: use check_http directly, not the check_nrpe!check_http form
} 
 
Restart the nagios and httpd services
service nagios restart 
service httpd restart              
 
 
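Troubleshooting: httpd was installed with yum, so the default document root is /var/www/html
Running the following check
/usr/local/nagios/libexec/check_http -I 192.168.1.129
reports
HTTP WARNING: HTTP/1.1 403 Forbidden
This happens because /var/www/html contains no files
cd /var/www/html
echo 123 >index.html
After a short while the Nagios check returns OK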
 
Monitoring MySQL availability with Nagios
On the monitored host, log in to the database and grant access
 
mysql> grant all privileges on *.* to xxxxx@192.168.1.130 identified by '123456'; 
Query OK, 0 rows affected (0.09 sec) 
 
mysql> flush privileges; 
Query OK, 0 rows affected (0.08 sec) 
 
On the monitoring host
Add the following to /usr/local/nagios/etc/objects/commands.cfg
 
# check_mysql command definition 
define command{ 
        command_name    check_mysql 
        # Note: the version of this definition in liuyu's PDF has a problem
        command_line    $USER1$/check_mysql -H $HOSTADDRESS$ -P $ARG1$ -u $ARG2$ -p $ARG3$
        } 
 
 
 
In the following file
vim /usr/local/nagios/etc/objects/129.cfg and add
 
define service{ 
 
use generic-service 
 
host_name 129 
 
service_description mysql 
 
check_command check_mysql!3306!xxxx!123456    ; port!user!password, matching $ARG1$ to $ARG3$ in the command definition; this is a direct check, not the check_nrpe!... form
 
notifications_enabled  0 
 
 
} 
Restart the nagios and httpd services
service nagios restart 
service httpd restart              
 
 
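Monitoring Tomcat availability with Nagios
Nothing needs to be done on the monitored host (check_tcp!8080 does not go through NRPE)


On the monitoring host
/usr/local/nagios/etc/objects/commands.cfg already defines the check_tcp command, so nothing needs to be added there either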
 
 
 
 
In the following file
vim /usr/local/nagios/etc/objects/hong221.cfg and add
define service{ 
 
use generic-service 
 
host_name hong221 
 
service_description tomcat 
 
check_command check_tcp!8080!xxxxx 
 
} 
 
To test manually, run the following command
[root@nagios objects]# /usr/local/nagios/libexec/check_tcp -H xxxxx -p 8080 
TCP OK - 0.141 second response time on port 8080|time=0.141140s;;;0.000000;10.000000 
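The part of the plugin output after the | character is performance data in label=value;warn;crit;min;max form. A small sketch for pulling one value out of such a line, for example when testing a check by hand; perf_value is a hypothetical name.

```shell
#!/bin/sh
# perf_value LABEL
# Reads one line of plugin output on stdin and prints the value
# (including its unit suffix) recorded for LABEL in the performance data.
perf_value() {
    awk -F'|' -v label="$1" '{
        n = split($2, items, " ")
        for (i = 1; i <= n; i++) {
            split(items[i], kv, "=")
            if (kv[1] == label) {
                split(kv[2], fields, ";")
                print fields[1]
            }
        }
    }'
}

# Usage: /usr/local/nagios/libexec/check_tcp -H xxxxx -p 8080 | perf_value time
```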
 
Restart the nagios and httpd services
service nagios restart 
service httpd restart              
 
The service then shows up on the monitoring page of the monitoring host
 
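Configuring Nagios email alerts to a 139.com mailbox
Fixing mail sent with the mail command not arriving at a 139.com mailbox
tail -f /var/log/maillog shows the following errors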
Feb 21 17:20:49 localhost postfix/qmgr[2072]: A296612227F: from=<root@localhost.localdomain>, size=700, nrcpt=1 (queue active) 
Feb 21 17:20:49 localhost sendmail[2275]: q1L9KmDa002275: to=xxxxx@139.com, ctladdr=root (0/0), delay=00:00:01, xdelay=00:00:00, mailer=relay, pri=30221, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (Ok: queued as A296612227F)
Feb 21 17:20:49 localhost postfix/smtpd[2276]: disconnect from localhost.localdomain[127.0.0.1] 
Feb 21 17:20:50 localhost postfix/smtp[2280]: A296612227F: to=<xxxxx@139.com>, relay=mx1.mail.139.com[221.176.9.178]:25, delay=0.53, delays=0.05/0.01/0.24/0.23, dsn=5.0.0, status=bounced (host mx1.mail.139.com[221.176.9.178] said: 550 985a4f43618db72-3c5de Mail rejected (in reply to end of DATA command))
Feb 21 17:20:50 localhost postfix/cleanup[2279]: 43FB812227E: message-id=<20120221092050.43FB812227E@localhost.localdomain> 
Feb 21 17:20:50 localhost postfix/qmgr[2072]: 43FB812227E: from=<>, size=2697, nrcpt=1 (queue active) 
Feb 21 17:20:50 localhost postfix/bounce[2281]: A296612227F: sender non-delivery notification: 43FB812227E 
Feb 21 17:20:50 localhost postfix/qmgr[2072]: A296612227F: removed 
 
On advice from others, the cause turned out to be the hostname (localhost.localdomain): mail from such a host may be treated as spam by 139.com
[root@nagios objects]# cat /etc/sysconfig/network 
NETWORKING=yes 
#HOSTNAME=localhost.localdomain 
HOSTNAME=nagios.localdomain 
 
[root@nagios objects]# cat /etc/hosts 
192.168.1.130   nagios.localdomain      nagios  # Added by NetworkManager 
127.0.0.1       localhost.localdomain   localhost 
::1     nagios.localdomain      nagios  localhost6.localdomain6 localhost6 
 
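So I changed the hostname to an arbitrary other name and rebooted the server. After that, mail went out and the 139.com mailbox received it.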
 
 
 
Nagios configuration for service alerts
On the monitoring host
vim /usr/local/nagios/etc/objects/contacts.cfg
define contact{ 
        contact_name                    nagiosadmin             ; Short name of user 
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user 
        service_notification_period     24x7 
        host_notification_period        24x7 
        service_notification_options    w,u,c,r 
        host_notification_options       d,u,r 
        service_notification_commands   notify-service-by-email 
        host_notification_commands      notify-host-by-email 
        email                           xxxxx@139.com           ; the address alerts are sent to (a 139.com mailbox is handy for operations work) <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        } 
         
define contactgroup{ 
        contactgroup_name       admins 
        alias                   Nagios Administrators 
        members                 nagiosadmin 
        } 
         
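Then restart the nagios service
service nagios restart
Note: alerts are only sent for services whose definition in the host configuration file contains the following line
notifications_enabled           1    (1 enables alerts, 0 disables them)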
 
 
 
 
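Note: when signing up for the 139.com mailbox, choose the long format for SMS notifications,
and set the mail-arrival notification to 24 hours a day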
 
 
vim templates.cfg 
define service{ 
        name                            generic-service         ; The 'name' of this service template 
        active_checks_enabled           1                       ; Active service checks are enabled 
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted 
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary) 
        check_freshness                 0                       ; Default is to NOT check service 'freshness' 
        notifications_enabled           1                       ; Service notifications are enabled 
        event_handler_enabled           1                       ; Service event handler is enabled 
        flap_detection_enabled          1                       ; Flap detection is enabled 
        failure_prediction_enabled      1                       ; Failure prediction is enabled 
        process_perf_data               1                       ; Process performance data 
        retain_status_information       1                       ; Retain status information across program restarts 
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts 
        is_volatile                     0                       ; The service is not volatile 
        check_period                    24x7                    ; The service can be checked at any time of the day 
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10                      ; Check the service every 10 minutes under normal conditions 
        retry_check_interval            2                       ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group 
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           10                      ; Re-notify about service problems every 10 minutes (this is the interval between repeated alerts)
        notification_period             24x7                    ; Notifications can be sent out at any time 
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        } 
 
 
 
         
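Troubleshooting Nagios errors

Solutions
1 After adding a check for a newly monitored host (for example, load on 129),
clicking Scheduling Queue and then the load service of 129 shows "CHECK_NRPE: Socket timeout after 10 seconds" under Status Information.
Things to check:

1 First, on the monitoring host, run
/usr/local/nagios/libexec/check_nrpe -H 192.168.1.129
and see whether it returns the NRPE version number.
Then check whether iptables is blocking the connection.

2 Check the file permissions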
cd /usr/local/nagios/etc/objects 
[root@localhost objects]# ll 
total 52 
-rw-r--r-- 1 root   root     314 Feb 16 15:58 129.cfg 
-rwxrwxrwx 1 nagios nagios  7856 Feb 16 16:06 commands.cfg 
-rwxrwxrwx 1 nagios nagios  2166 Feb 16 13:58 contacts.cfg 
-rwxrwxrwx 1 nagios nagios  5403 Feb 16 13:58 localhost.cfg 
-rwxrwxrwx 1 nagios nagios  3124 Feb 16 13:58 printer.cfg 
-rwxrwxrwx 1 nagios nagios  3293 Feb 16 13:58 switch.cfg 
-rwxrwxrwx 1 nagios nagios 10812 Feb 16 13:58 templates.cfg 
-rwxrwxrwx 1 nagios nagios  3209 Feb 16 13:58 timeperiods.cfg 
-rwxrwxrwx 1 nagios nagios  4007 Feb 16 13:58 windows.cfg 
 
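Check whether the file for the newly added host is readable and writable by the nagios user. If it is not, change its owner and mode to match the other files, which gives the following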
[root@localhost objects]# ll 
total 52 
-rwxrwxrwx 1 nagios nagios   314 Feb 16 15:58 129.cfg 
-rwxrwxrwx 1 nagios nagios  7856 Feb 16 16:06 commands.cfg 
-rwxrwxrwx 1 nagios nagios  2166 Feb 16 13:58 contacts.cfg 
-rwxrwxrwx 1 nagios nagios  5403 Feb 16 13:58 localhost.cfg 
-rwxrwxrwx 1 nagios nagios  3124 Feb 16 13:58 printer.cfg 
-rwxrwxrwx 1 nagios nagios  3293 Feb 16 13:58 switch.cfg 
-rwxrwxrwx 1 nagios nagios 10812 Feb 16 13:58 templates.cfg 
-rwxrwxrwx 1 nagios nagios  3209 Feb 16 13:58 timeperiods.cfg 
-rwxrwxrwx 1 nagios nagios  4007 Feb 16 13:58 windows.cfg
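Spotting the odd file out in a long listing is error-prone, so the ownership check can be automated. The sketch below lists the .cfg files in a directory that are not owned by a given user; cfg_not_owned_by is a hypothetical name, and it relies on the -maxdepth option, which GNU find on CentOS provides.

```shell
#!/bin/sh
# cfg_not_owned_by DIR USER
# Prints every *.cfg file directly inside DIR that is not owned by USER
# (USER may be a name or a numeric ID). Prints nothing if all files match.
cfg_not_owned_by() {
    find "$1" -maxdepth 1 -type f -name '*.cfg' ! -user "$2"
}

# Usage: cfg_not_owned_by /usr/local/nagios/etc/objects nagios
# Each file it prints can then be fixed with, for example:
#   chown nagios.nagios /usr/local/nagios/etc/objects/129.cfg
```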