Tag: apache

Setting Up a Monitoring Server with Nagios

Posted by – 2009-08-11

####################################
#nagios_configuration
#Author:楚霏
#Date: 2009-3-19
#Update:2009-8-11
#Env: Centos 5.3 x86_64
#Thanks to Sery for his help
####################################

1. Preparation
####################################
Environment: CentOS 5.3 x86_64
Required software:
nagios-3.1.?.tar.gz
nagios-plugins-1.4.13.tar.gz
nrpe-2.12.tar.gz
httpd-2.2.??.tar.gz
gcc
glibc
glibc-common
gd
gd-devel
fetion20090406003-linux.tar.gz
library_linux.tar.gz
libstdc++-4.3.0-8.x86_64.rpm
####################################

####################################
#Download the required software
cd /usr/local/src/
wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.1.2.tar.gz
wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz
wget http://jaist.dl.sourceforge.net/sourceforge/nagios/nrpe-2.12.tar.gz
wget ftp://mirror.switch.ch/pool/2/mirror/fedora/linux/releases/9/Fedora/x86_64/os/Packages/libstdc++-4.3.0-8.x86_64.rpm
wget http://www.it-adv.net/fetion/downng/fetion20090406003-linux.tar.gz
wget http://www.it-adv.net/fetion/downng/library_linux.tar.gz
####################################

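2. Environment
####################################
Both machines run CentOS 5.3 x86_64
Monitoring server IP=10.0.0.52
Monitored host IP=10.0.0.166
On the monitoring server Nagios runs as the user nagios; both nagios and the Apache user (apache) are added to the nagcmd group

The monitoring server needs nagios, nagios-plugins, nrpe and fetion
The monitored host only needs nagios-plugins and nrpe

A web environment with PHP and GD is not strictly required by Nagios; it is there so the monitoring status can be viewed in a browser, and the HTML pages shipped with Nagios need php+gd

All hosts and services are added or removed on the monitoring server
nagios.cfg on the monitoring server is the main configuration file; among other things it records where the other configuration files live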
####################################

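3. Installation and configuration
####################################
(1) Install an apache+php+gd web environment on the monitoring server. Building from source is recommended and not covered here; for convenience this guide uses yum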
yum -y install gcc glibc glibc-common gd gd-devel httpd php php-gd libpng
####################################

####################################
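(2) Install Nagios on the monitoring server
#Create the required users and groups
useradd -m nagios
groupadd nagcmd && usermod -a -G nagcmd nagios

#The next command adds the user that runs the web server (apache) to the nagcmd group, so the web interface can submit external commands
usermod -a -G nagcmd apache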

cd /usr/local/src/
tar xvf nagios-3.1.?.tar.gz && cd nagios-3.1.?

#Optionally look at the configure help first
./configure --help
./configure --prefix=/usr/local/nagios --with-command-group=nagcmd
make all

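#Step 1: make install installs the main program, the CGIs and the HTML files
#Step 2: make install-init installs the init script (/etc/init.d/nagios) so that Nagios can be run as a service and started at boot
#Step 3: make install-commandmode sets the permissions on the directory that holds the external command file
#Step 4: make install-config copies the sample configuration files into the Nagios installation directory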
make install
make install-init
make install-commandmode
make install-config

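#Verify that everything landed in the install path given above (/usr/local/nagios here): the five directories etc, bin, sbin, share and var should exist.
#bin   Executables, including the nagios binary
#etc   Configuration files; after make install-config it holds the sample nagios.cfg, cgi.cfg, resource.cfg and the objects/ directory
#sbin  Nagios CGI programs behind the web interface, including the one that submits external commands
#share Nagios web pages
#var   Nagios log files, the pid (lock) file and other runtime files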
ls /usr/local/nagios
####################################
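#The directory check above can also be scripted. A minimal sketch, assuming the
#install prefix used in this guide; the function name missing_nagios_dirs is
#made up for illustration and is not part of Nagios.

```shell
# Print the expected Nagios sub-directories that are missing under a prefix.
# Names are separated by spaces; an empty line means all five exist.
missing_nagios_dirs() {
    prefix=${1:-/usr/local/nagios}
    missing=""
    for d in bin etc sbin share var; do
        if [ ! -d "$prefix/$d" ]; then
            missing="$missing $d"
        fi
    done
    # Strip the leading space before printing
    echo "${missing# }"
}

# Example: missing_nagios_dirs /usr/local/nagios
```

#An empty result means the layout matches what make install is expected to produce.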

####################################
(3) Configure the web interface
#This amounts to adding the following to httpd.conf

#----------------------------begin quote----------------------------
# Load config files from the config directory "/etc/httpd/conf.d".
Include conf.d/*.conf
#----------------------------end quote----------------------------

#and then creating a new file under /<install path>/httpd/conf.d/ with this content:

#----------------------------begin quote----------------------------
# SAMPLE CONFIG SNIPPETS FOR APACHE WEB SERVER
# Last Modified: 11-26-2005
#
# This file contains examples of entries that need
# to be incorporated into your Apache web server
# configuration file. Customize the paths, etc. as
# needed to fit your system.

ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"

<Directory "/usr/local/nagios/sbin">
# SSLRequireSSL
AuthType Basic
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
AuthName "Nagios Access"
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>

Alias /nagios "/usr/local/nagios/share"

<Directory "/usr/local/nagios/share">
# SSLRequireSSL
AuthType Basic
Options None
AllowOverride None
Order allow,deny
Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
AuthName "Nagios Access"
AuthUserFile /usr/local/nagios/etc/htpasswd.users
Require valid-user
</Directory>

#----------------------------end quote----------------------------

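#With Apache installed from yum, the following command (run from the Nagios source directory) does the above for you
make install-webconf
#Create the user for web authentication
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
#Add index.php to DirectoryIndex in httpd.conf
#Other Apache settings are not covered here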
service httpd start
####################################

####################################
(4) Install the Nagios plugins
cd /usr/local/src/
tar xvf nagios-plugins-1.4.??.tar.gz && cd nagios-plugins-1.4.??
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
####################################

####################################
(5) Register Nagios as a service and do a trial run
chkconfig --add nagios
chkconfig --level 3 nagios on

#Check the configuration files
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

#Make sure the nagios user is allowed to run the plugins
chown -R nagios:nagios /usr/local/nagios/libexec/

#If there are no errors, start it
service nagios start
####################################

####################################
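(6) Overview of the Nagios configuration files
#Main configuration file: nagios.cfg

#Log file
#Format: log_file=<file_name>
#Example:
#log_file=/usr/local/nagios/var/nagios.log

#Object configuration files
#Format: cfg_file=<file_name>
#Example:
#cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
#cfg_file=/usr/local/nagios/etc/objects/contactgroups.cfg

#Object configuration directory
#Format: cfg_dir=<directory_name>
#Example:
#cfg_dir=/usr/local/nagios/etc/switches

#Nagios user
#Format: nagios_user=<username/UID>
#Example:
#nagios_user=nagios

#cgi.cfg is the configuration file that controls the CGI scripts

#Objects are all the elements that can be monitored or notified.
#The object configuration files used below are mainly:
#hosts.cfg          defines the monitored hosts
#hostgroups.cfg     defines groups of monitored hosts
#services.cfg       defines services
#servicegroups.cfg  defines service groups
#contacts.cfg       defines contacts
#contactgroups.cfg  defines contact groups
#timeperiods.cfg    defines time periods, e.g. 24x7 for round-the-clock monitoring
#commands.cfg       defines commands
#servicedependency  defines service dependencies
#serviceescalation  defines service escalations
#hostdependency     defines host dependencies
#hostescalation     defines host escalations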
####################################
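#Because nagios.cfg only points at the object files through cfg_file= lines, a
#path that does not exist yet is an easy mistake to make. A rough sketch that
#lists such paths; missing_cfg_files is a hypothetical helper name, and the
#sketch only looks at cfg_file= lines (not cfg_dir=).

```shell
# Print every cfg_file= path in a nagios.cfg that does not exist on disk.
missing_cfg_files() {
    main_cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    # Keep only uncommented cfg_file lines, then take the text after the first "="
    grep '^[[:space:]]*cfg_file[[:space:]]*=' "$main_cfg" | cut -d= -f2- |
    while read -r path; do
        [ -f "$path" ] || echo "$path"
    done
}

# Example: missing_cfg_files /usr/local/nagios/etc/nagios.cfg
```

#No output means every referenced object file is present; nagios -v remains the authoritative check.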

####################################
(7) Edit the configuration files
cd /usr/local/nagios/etc/
cp nagios.cfg nagios.cfg.chushibak
vi nagios.cfg
#Change this part

#----------------------------begin quote----------------------------
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
#----------------------------end quote----------------------------

#to
#----------------------------begin quote----------------------------
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/contactgroups.cfg

cfg_file=/usr/local/nagios/etc/objects/services.cfg
cfg_file=/usr/local/nagios/etc/objects/servicegroups.cfg

cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
#----------------------------end quote----------------------------

####################################

####################################
(8) Create and edit the object configuration files
cd /usr/local/nagios/etc/objects
mkdir bak
mv contacts.cfg ./bak/
mv localhost.cfg ./bak/

cat << EOF >> hosts.cfg
#----------------------------begin quote----------------------------

define host{
host_name 10.0.0.52
alias 10.0.0.52
address 10.0.0.52
max_check_attempts 5
#check_interval 1
#retry_interval 1
check_period 24x7
contact_groups sa_groups
notification_interval 30
#first_notification_delay #
notification_period 24x7
notification_options d,u,r
}

define host{
host_name 10.0.0.166
alias 10.0.0.166
address 10.0.0.166
max_check_attempts 5
#check_interval 1
#retry_interval 1
check_period 24x7
contact_groups sa_groups
notification_interval 30
#first_notification_delay #
notification_period 24x7
notification_options d,u,r
}
EOF
#----------------------------end quote----------------------------

cat << EOF >> hostgroups.cfg
#----------------------------begin quote----------------------------
define hostgroup{
hostgroup_name all_hosts
alias all_hosts
members 10.0.0.52,10.0.0.166
#notes note_string
#notes_url url
#action_url url
}
define hostgroup{
hostgroup_name http_hosts
alias http_hosts
members 10.0.0.166
#notes note_string
#notes_url url
#action_url url
}
EOF
#----------------------------end quote----------------------------

cat << EOF >> contacts.cfg
#----------------------------begin quote----------------------------
define contact{
contact_name cheng
alias sa_cheng
host_notifications_enabled 1
service_notifications_enabled 1
host_notification_period 24x7
service_notification_period 24x7
host_notification_options d,u,r
service_notification_options w,u,c,r
host_notification_commands notify-host-by-email,notify-host-by-sms
service_notification_commands notify-service-by-email,notify-service-by-sms
email yxcx@yahoo.cn
pager 13712345678
can_submit_commands 1
#retain_status_information [0/1]
#retain_nonstatus_information [0/1]
}
EOF
#----------------------------end quote----------------------------

cat << EOF >> contactgroups.cfg
#----------------------------begin quote----------------------------
define contactgroup{
contactgroup_name sa_groups
alias sa_groups
members cheng
#contactgroup_members contactgroups
}
EOF
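#----------------------------end quote----------------------------

#The check command (check_command) used below must be defined in the command configuration file (commands.cfg) or, for checks run through NRPE, in nrpe.cfg
#Maximum check attempts (max_check_attempts): 3-4 is usually a good value; it keeps the checks from being so sensitive that they raise false alarms. Getting an SMS for every dropped packet would be maddening
#Check interval (check_interval) and retry interval (retry_interval) are in minutes; adjust them per check as appropriate
#Notification interval (notification_interval) is how many minutes to wait between repeated alerts once a problem has been detected.
#Notification option letters:
#d=send notifications on a DOWN state
#w=send notifications on a WARNING state
#c=send notifications on a CRITICAL state
#u=send notifications on an UNREACHABLE or UNKNOWN state
#r=send notifications on recoveries (OK state)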
#f=send notifications when the host or service starts and stops flapping
#s=send notifications when scheduled downtime starts and ends
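
#How the attempt and interval settings above combine can be worked out with a
#little arithmetic. A rough sketch, assuming the default interval_length of 60
#seconds (so intervals are minutes) and that retries run exactly retry_interval
#apart; minutes_to_hard_state is a hypothetical helper name.

```shell
# Minutes between the first failed check and the point where the problem is
# confirmed (all max_check_attempts used up) and notifications can go out.
minutes_to_hard_state() {
    max_check_attempts=$1
    retry_interval=$2
    # The first failure counts as attempt 1, so only the retries add delay
    echo $(( (max_check_attempts - 1) * retry_interval ))
}

# Example: minutes_to_hard_state 3 5
```

#With max_check_attempts 3 and retry_interval 5 this gives 10 minutes; the time until the first failure is noticed (up to one check_interval) comes on top of that.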

cat << EOF >> services.cfg
#----------------------------begin quote----------------------------

#monitor hosts
define service{
host_name 10.0.0.166
service_description check_ftp
check_command check_ftp
max_check_attempts 3
check_interval 10
retry_interval 5
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
EOF
#----------------------------end quote----------------------------

cat << EOF >> servicegroups.cfg
#----------------------------begin quote----------------------------
#monitor all_hosts
define service{
hostgroup_name all_hosts
service_description check_host-alive
check_command check_ping!100.0,20%!500.0,60%
max_check_attempts 5
check_interval 3
retry_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
define service{
hostgroup_name all_hosts
service_description check_df
check_command check_nrpe!check_df
max_check_attempts 4
check_interval 1440
retry_interval 5
check_period 24x7
notification_interval 1440
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
define service{
hostgroup_name all_hosts
service_description check_load
check_command check_nrpe!check_load
max_check_attempts 5
check_interval 5
retry_interval 5
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
define service{
hostgroup_name all_hosts
service_description check_zombie_procs
check_command check_nrpe!check_zombie_procs
max_check_attempts 5
check_interval 5
retry_interval 5
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
define service{
hostgroup_name all_hosts
service_description check_total_procs
check_command check_nrpe!check_total_procs
max_check_attempts 5
check_interval 5
retry_interval 5
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
define service{
hostgroup_name all_hosts
service_description check_ssh
check_command check_ssh
max_check_attempts 3
check_interval 60
retry_interval 5
check_period 24x7
notification_interval 60
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}

#monitor http_hosts
define service{
hostgroup_name http_hosts
service_description check_http
check_command check_http
max_check_attempts 4
check_interval 3
retry_interval 1
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,u,c
#contacts contacts(*)
contact_groups sa_groups
}
EOF
#----------------------------end quote----------------------------

####################################

####################################
(9) Install nrpe on the monitoring server
cd /usr/local/src/
tar xvf nrpe-2.??.tar.gz && cd nrpe-2.??
./configure --prefix=/usr/local/nrpe

#When configure finishes it prints a summary of the settings
#----------------------------begin quote----------------------------
General Options:
-------------------------
NRPE port: 5666
NRPE user: nagios
NRPE group: nagios
Nagios user: nagios
Nagios group: nagios
#----------------------------end quote----------------------------
make
make install

#Copy a few plugins so that nrpe works properly
cp /usr/local/nrpe/libexec/check_nrpe /usr/local/nagios/libexec/
cp /usr/local/nagios/libexec/check_disk /usr/local/nrpe/libexec/
cp /usr/local/nagios/libexec/check_load /usr/local/nrpe/libexec/
cp /usr/local/nagios/libexec/check_ping /usr/local/nrpe/libexec/
cp /usr/local/nagios/libexec/check_procs /usr/local/nrpe/libexec/
chown -R nagios:nagios /usr/local/nrpe/libexec/

#Add the following at a suitable place in /usr/local/nagios/etc/objects/commands.cfg; I put it between check_ssh and check_dhcp
vi /usr/local/nagios/etc/objects/commands.cfg
#----------------------------begin quote----------------------------
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
#----------------------------end quote----------------------------
####################################
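#To see what the check_nrpe command definition above turns into at run time,
#the macro substitution can be imitated in the shell. This is only an
#illustration: expand_check_nrpe is a made-up name, and $USER1$ is assumed to
#be /usr/local/nagios/libexec as set in resource.cfg for this install prefix.

```shell
# Expand a service check_command such as "check_nrpe!check_df" into the
# command line Nagios would run for a given host address.
expand_check_nrpe() {
    check_command=$1
    host_address=$2
    user1=${3:-/usr/local/nagios/libexec}
    # Everything after the first "!" is $ARG1$
    arg1=${check_command#*!}
    echo "$user1/check_nrpe -H $host_address -c $arg1"
}

# Example: expand_check_nrpe 'check_nrpe!check_df' 10.0.0.166
```

#Running the printed command by hand as the nagios user is a quick way to test an NRPE check before Nagios schedules it.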

####################################
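(10) Configure nrpe
mkdir /usr/local/nrpe/etc
cp sample-config/nrpe.cfg /usr/local/nrpe/etc/

#Change the following options in /usr/local/nrpe/etc/nrpe.cfg
#server_address=the address nrpe listens on; this host is checked at 10.0.0.52, so nrpe must listen there rather than only on 127.0.0.1
#allowed_hosts=the machines that are allowed to query this nrpe
#----------------------------begin quote----------------------------
server_address=10.0.0.52
allowed_hosts=127.0.0.1,10.0.0.52
#----------------------------end quote----------------------------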

#Adjust the command section to your needs. For disks, I commented out the check_hda1 command and check all disks instead
#----------------------------begin quote----------------------------
#command[check_hda1]=/usr/local/nrpe/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_df]=/usr/local/nrpe/libexec/check_disk -w 20% -c 10%
#----------------------------end quote----------------------------

#Register nrpe as a service and start it
cp init-script /etc/init.d/nrpe
chmod 755 /etc/init.d/nrpe
chkconfig --add nrpe
chkconfig --level 3 nrpe on
service nrpe start
####################################

####################################
(11) Install the Fetion robot (used to send SMS alerts)
cd /usr/local/src/
rpm -Uvh libstdc++-4.3.0-8.x86_64.rpm
tar xvf fetion20090406003-linux.tar.gz
tar xvf library_linux.tar.gz
mv install ../sms
mv libACE* /usr/local/lib64/
mv libcrypto.so.0.9.8 /usr/local/lib64/
mv libssl.so.0.9.8 /usr/local/lib64/
echo "/usr/local/lib64/" >> /etc/ld.so.conf
ldconfig
chown -R nagios:nagios /usr/local/sms
chmod 755 /usr/local/sms/fetion

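#It is best to switch to the nagios user and send a test SMS
su nagios
#13744444444 is the mobile number the SMS is sent from
#jiubugaosuni is the password for 13744444444
#Replace 13712345678 with your own mobile number
/usr/local/sms/fetion --mobile=13744444444 --pwd=jiubugaosuni --to=13712345678 --msg-utf8=test
#Don't forget to return to the root user
exit

#Add the SMS notification commands; I put them right below the email section
vi /usr/local/nagios/etc/objects/commands.cfg
#----------------------------begin quote----------------------------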
# 'notify-host-by-sms' command definition
define command{
command_name notify-host-by-sms
command_line /usr/local/sms/fetion --mobile=13744444444 --pwd=jiubugaosuni --to=$CONTACTPAGER$ --msg-utf8="$NOTIFICATIONTYPE$: Host $HOSTNAME$ is $HOSTSTATE$ info: $HOSTOUTPUT$"
}
# 'notify-service-by-sms' command definition
define command{
command_name notify-service-by-sms
command_line /usr/local/sms/fetion --mobile=13744444444 --pwd=jiubugaosuni --to=$CONTACTPAGER$ --msg-utf8="$NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$"
}
#----------------------------end quote----------------------------

#Update contacts.cfg and contactgroups.cfg accordingly, mainly the mobile number
####################################

####################################
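(12) Restart the nagios service and verify that the monitoring server itself is being monitored
#Check the configuration files for errors
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
service nagios restart
#Open http://ip/nagios/ in a browser to check the result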
####################################
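#The check-then-restart sequence above can be wrapped so that a broken
#configuration never reaches a running Nagios. A minimal sketch under the paths
#used in this guide; the function name and the NAGIOS_BIN, NAGIOS_CFG and
#RESTART_CMD override variables are invented here for illustration.

```shell
# Restart Nagios only if the configuration check exits successfully.
safe_restart_nagios() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    restart_cmd=${RESTART_CMD:-service nagios restart}
    if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
        # Left unquoted on purpose so "service nagios restart" splits into words
        $restart_cmd
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Example (as root): safe_restart_nagios
```

#The verbose output of nagios -v is discarded here; run the check by hand to see what is wrong when the function refuses to restart.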

####################################
(13) Install nagios-plugins and nrpe on the monitored host
useradd -m nagios
cd /usr/local/src/
tar xvf nagios-plugins-1.4.13.tar.gz
cd nagios-plugins-1.4.13
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
cd ../
tar xvf nrpe-2.12.tar.gz
cd nrpe-2.12
./configure
make
make install
mkdir -p /usr/local/nagios/etc/
cp sample-config/nrpe.cfg /usr/local/nagios/etc/

#Change the following options in /usr/local/nagios/etc/nrpe.cfg
#server_address=set according to your environment
#allowed_hosts=the machines that are allowed to query this nrpe
#----------------------------begin quote----------------------------
server_address=10.0.0.166
allowed_hosts=127.0.0.1,10.0.0.52,10.0.0.166
#----------------------------end quote----------------------------
#Adjust the command section to your needs. For disks, I commented out the check_hda1 command and check all disks instead. On this host both nrpe and the plugins use the default prefix, so the plugins are under /usr/local/nagios/libexec
#----------------------------begin quote----------------------------
#command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_df]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
#----------------------------end quote----------------------------
cp init-script /etc/init.d/nrpe
chmod 755 /etc/init.d/nrpe
chkconfig --add nrpe
chkconfig --level 3 nrpe on
service nrpe start
####################################
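#A common reason for check_nrpe being refused is that the monitoring server is
#missing from allowed_hosts on the monitored host. The sketch below checks an
#nrpe.cfg for a given address; nrpe_allows is a hypothetical helper, and it
#only understands a plain comma-separated list of addresses without spaces.

```shell
# Succeed (exit status 0) if the address is listed in allowed_hosts.
nrpe_allows() {
    cfg=$1
    ip=$2
    # Take the value of the last uncommented allowed_hosts line
    allowed=$(grep '^allowed_hosts=' "$cfg" | tail -n 1 | cut -d= -f2)
    # Wrap both sides in commas so 10.0.0.5 does not match 10.0.0.52
    case ",$allowed," in
        *",$ip,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: nrpe_allows /usr/local/nagios/etc/nrpe.cfg 10.0.0.52 && echo allowed
```

#After editing allowed_hosts, nrpe has to be restarted for the change to take effect.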

####################################
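(14) How to add another monitored host
#Steps:
#a. Make sure nagios-plugins and nrpe are correctly installed on the host
#b. Define the host in hosts.cfg: copy an existing host definition and adjust it
#c. In hostgroups.cfg, add the host to the groups it should belong to
#d. If a service you need to monitor is not already defined in servicegroups.cfg, define it in services.cfg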
####################################
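#Step b above is mostly copy and paste, so it lends itself to a small
#generator. A sketch only: make_host_definition is a made-up name, the
#directive values are copied from the host definitions in step (8), and
#10.0.0.167 in the example is an invented address.

```shell
# Print a host definition in the same shape as the ones written to hosts.cfg.
make_host_definition() {
    ip=$1
    cat <<EOF
define host{
host_name $ip
alias $ip
address $ip
max_check_attempts 5
check_period 24x7
contact_groups sa_groups
notification_interval 30
notification_period 24x7
notification_options d,u,r
}
EOF
}

# Example:
# make_host_definition 10.0.0.167 >> /usr/local/nagios/etc/objects/hosts.cfg
```

#The new host still has to be added to the members line of the relevant groups in hostgroups.cfg (step c) before the group-based services apply to it.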

####################################
(15) Notes on monitoring a MySQL server
#nagios-plugins has to be built with --with-mysql=/usr/local/mysql (your MySQL install path)
#./configure --with-mysql=/usr/local/mysql --with-nagios-user=nagios --with-nagios-group=nagios
#Do the following on the monitored host
#The check simply queries an empty database nrpe as a user nrpe that only has SELECT privilege. It is equivalent to mysqladmin -u user --password='password' status -i 2
mysql -p
#----------------------------begin quote----------------------------
mysql> create database nrpe;
mysql> grant select on nrpe.* to nrpe@localhost identified by 'password';
mysql> grant select on nrpe.* to nrpe@'10.0.0.52' identified by 'password';
#----------------------------end quote----------------------------
#Trial run; it prints the MySQL status
/usr/local/nagios/libexec/check_mysql -u nrpe -p password -d nrpe
#Trial run on the monitoring server (needs the MySQL client libraries)
/usr/local/nagios/libexec/check_mysql -H 10.0.0.166 -u nrpe -p password -d nrpe
####################################

####################################
(16) A web server can also be monitored through nrpe
#In services.cfg on the monitoring server, replace check_http with check_nrpe!check_http where needed
#Add this line to nrpe.cfg on the monitored host
#----------------------------begin quote----------------------------
command[check_http]=/usr/local/nagios/libexec/check_http -H www.chengyongxu.com -u /index.php
#----------------------------end quote----------------------------
#In other words, a page on this web server is requested; if the page loads normally, the web service is fine

Installing Apache 2.0 on FreeBSD 6.3

Posted by – 2009-01-13

Environment: FreeBSD 6.3
Required packages
libiconv-1.9.2_2.tbz
libxml2-2.6.23_1.tbz
mod_security.tar.gz
awstats-6.7.tar.gz
httpd-2.0.*
perl-5.8.8.tbz
pkgconfig-0.20.tbz
Installation steps
# pkg_add pkgconfig-0.20.tbz
# pkg_add libiconv-1.9.2_2.tbz
# pkg_add libxml2-2.6.23_1.tbz
# pkg_add perl-5.8.8.tbz
# tar zxvf httpd-2.0.*
# cd httpd-2.0.??
# ./configure --prefix=/usr/local/apache --enable-so --enable-rewrite
# make && make install
# cp ../mod*so /usr/local/apache/modules
# echo "/usr/local/apache/bin/apachectl start &" > /etc/rc.d/apache.sh
# pw groupadd httpweb
# pw useradd paobaapache -g httpweb -d /dev/null -p passwd
# sed -e "s/Timeout 300/Timeout 30/" -e "s/KeepAlive On/KeepAlive Off/" -e "s/User nobody/User paobaapache/" -e "s/Group #-1/Group httpweb/" -e "s/ServerSignature On/ServerSignature Off/" -e "s/ServerTokens Full/ServerTokens Prod/" -e "s/LogLevel warn/LogLevel error/" /usr/local/apache/conf/httpd.conf > /usr/local/apache/conf/httpd.conf1