Oracle DB – Monitor with nagios using check oracle health

This tutorial explains how to set-up the check_oracle_health script (credits to Gerhard Lausser) to work on your Nagios environment on CentOS (or any RedHat based Linux). This nagios plugin allows to monitor many oracle DB parameters – like tablespaces size, session, process count, SGA pool etc. Check it out on the author’s webset. The hardest […]

by Mihail Vukadinoff

December 4, 2014

4 min read

oracle y lupa1 - Oracle DB - Monitor with nagios using check oracle health

This tutorial explains how to set-up the check_oracle_health script (credits to Gerhard Lausser) to work on your Nagios environment on CentOS (or any RedHat based Linux). This nagios plugin allows to monitor many oracle DB parameters – like tablespaces size, session, process count, SGA pool etc. Check it out on the author’s webset.
The hardest part of the setup is installing the dependent perl libraries and making modifications in the perl code for them to work.
On oracle server we need to create the monitoring user and grant rights, only the minimum necessary for the script to work.

su – oracle
sqlplus / as sysdba
 

Type the commands:

CREATE USER nagios IDENTIFIED BY account unlock;
GRANT CREATE SESSION TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;

 On Nagios server: Upload the check script: check_oracle_health (can be downloaded from the author’s site in the link above)

  • Configure using command
./configure -prefix=/usr/local/nagios/libexec -with-nagios-user=nagios
yum search libaio
yum install libaio.x86_64
  • Install the oracle client base libs, the devel package and the sqlplus client for sql
rpm -ivh oracle-instantclient12.1-basic-12.1.0.2.0-1.x86_64.rpm
rpm -ivh
       OracleInstantClient/oracle-instantclient12.1-sqlplus-12.1.0.2.0-1.x86_64.rpm
rpm -ivh
      OracleInstantClient/oracle-instantclient12.1-devel-12.1.0.2.0-1.x86_64.rpm
  • Install all dependent perl modules – DBI , DBD::Oracle

Now here comes the hard part. The DBD::Oracle perl module is not in an rpm repository, the good thing is that you can download it via CPAN. Unfortunately it doesn’t work out of the box. I had to make quite a few modifications to get it working.Follow below.

perl-DBI is already installed on nagios server, it’s an RPM pacakge

## oracle is not available in rpm so install manually
#with CPAN
perl -MCPAN -e shell
get DBD::Oracle
## make with makefile.PL
export ORACLE_HOME=/usr/lib/oracle/12.1/client64
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
 
cd .cpan/build/DBD-Oracle-1.74/
perl Makefile.PL
## needed to use
perl Makefile.PL -l
make
make test
make install
 

Now you need to copy over the included perl lib files in the “Nagios” dir included in the check_oracle_health script package.

cp -R Nagios/ /usr/lib/perl5/site_perl/
 

Up to here the basic functionality of the script should be available, however when you try the tablespace checks I got a weird error. Like: Tablespace.pm did not return a true value or: Undefined subroutine &DBD::Oracle::Server::Database::Tablespace::init_tablespaces

Make the following amendments in the code.

(add “1;” at the last line)

## tablespace to work
vi /usr/lib/perl5/site_perl/Nagios/DBD/Oracle/Server.pm
use Nagios::DBD::Oracle::Server::Database;
## then
cp -r Nagios/DBD ./
 
## add “1;” at the last line
/usr/lib/perl5/site_perl/DBD/Oracle/Server/Database/Tablespace.pm

1;
 

Now you have to configure the check command in nagios. For the command to work, it needs to have the oracle client paths added to the environment variables – ORACLE_HOME, LD_LIBRARY_PATH,PATH. To avoid all sorts of issues in passing environment variables in nagios and altering the startup script of nagios to set them (as described on the author’s site), just pass the variables directly in the command. The latest version of check_oracle_health allows that.

 ## configure the command
vi objects/commands.cfg
 
# Oracle Database check health command
define command{
command_name        check_oracle_health
command_line          /usr/bin/perl $USER1$/check_oracle_health.pl
–connect=//$HOSTADDRESS$:$ARG1$/$ARG2$   –username=$ARG3$   –password=$ARG4$
–warning=$ARG5$  –critical=$ARG5$  –mode $ARG7$  –environment
ORACLE_HOME=’/usr/lib/oracle/12.1/client64′  –environment
PATH=/usr/lib/oracle/12.1/client64/bin  –environment
LD_LIBRARY_PATH=/usr/lib/oracle/12.1/client64/lib
   }
 
# Oracle Database check health command with -name attribute
define command{
command_name       check_oracle_health_name
command_line         /usr/bin/perl   $USER1$/check_oracle_health.pl
–connect=//$HOSTADDRESS$:$ARG1$/$ARG2$ –username=$ARG3$ –password=$ARG4$
–warning=$ARG5$ –critical=$ARG5$ –mode $ARG7$ –name $ARG8$ –environment ORACLE_HOME=’/usr/lib/oracle/12.1/client64′ –environment PATH=/usr/lib/oracle/12.1/client64/bin –environment LD_LIBRARY_PATH=/usr/lib/oracle/12.1/client64/lib
   }
 

It’s good to have a command without thresholds too and use depending on the services you’re using that the check_oracle_health script provides. For example if you’re checking for free space the cmd will give you “are you sure you want to be notified if free space is gets above threshold

# Oracle Database check health command without thresholds
define command{
command_name      check_oracle_health_nothres
command_line        /usr/bin/perl $USER1$/check_oracle_health.pl
–connect=//$HOSTADDRESS$:$ARG1$/$ARG2$ –username=$ARG3$ –password=$ARG4$
–mode $ARG5$
–environment ORACLE_HOME=’/usr/lib/oracle/12.1/client64′ –environment PATH=/usr/lib/oracle/12.1/client64/bin –environment LD_LIBRARY_PATH=/usr/lib/oracle/12.1/client64/lib
   }
 
## add the service checks for the staging oracle server
define service {
     use                                                    generic-service
host_name                                      oracledbtest
service_description                    Oracle sessions
### check_oracle_health.pl —        PORT  !      SID  !    USER   !   Pass  !   Warn  !   crit  !
mode
check_command
check_oracle_health!1521!YOOURSID!nagios!!60!80!session-usage
}
 
### Change last argument to the different modes
tns-ping , process-usage , sga-shared-pool-free
 

You can add all the modes you want. You can manually test the command like so. It’s very important that the ORACLE_HOME and other variables are set. They can be passed to the command directly, this way you don’t need to modify your nagios startup script to set the vars in the nagios user enviornment.

./check_oracle_health.pl –connect=//X.X.X.X:1521/YOOURSID –username=nagios -password=XXXX –mode=tablespace-free –name NAMEOFTBLSPC –environment ORACLE_HOME=’/usr/lib/oracle/12.1/client64′ –environment PATH=$PATH:$ORACLE_HOME/bin –environment LD_LIBRARY_PATH=$ORACLE_HOME/lib
 

Share your tips and thoughts in the comments below.