HP Operations Manager

TruCluster system information


This topic summarizes the features, notes, and recommendations for monitoring applications on a TruCluster system, and explains how to relocate the monitoring of a single-instance application during failover.

Notes and recommendations

Relocating the monitoring of a single-instance application during failover

Cluster Application Availability (CAA) starts a single-instance application on an individual TruCluster member and relocates it to another cluster member during failover. You can also use CAA to relocate the monitoring of that application during failover.

For further information on CAA, see the Cluster Highly Available Applications manual in the Tru64 UNIX TruCluster documentation set. Chapter 2, Using CAA for Single-Instance Application Availability, is particularly useful.

For further information on TruCluster system administration, see the Cluster Administration manual in the Tru64 UNIX TruCluster documentation set. Tru64 UNIX documentation is available online at the following URL: http://h30097.www3.hp.com/docs/pub_page/doc_list.html

To make the application a highly available CAA resource

  1. Create the CAA resource profile and action script for the application, either through the SysMan Menu or by using the caa_profile command.

    This step creates the /var/cluster/caa/profile/.cap and /var/cluster/caa/scripts/.scr files, respectively.

    1. Test the action script.
    2. Validate the resource profile.
    3. Register the resource with CAA.
    4. Start the resource.
  2. From the HPOM server, assign the template that you created to monitor your application to all TruCluster members.
  3. From the HPOM server, distribute this template to all cluster members.
  4. After the initial distribution of the template, use the opctemplate -d command to disable the template for all cluster members on which the application does not run.

    opctemplate -d

    This step is required only after the initial template distribution. On subsequent template distributions, the template state is maintained on the managed TruCluster members.

  5. Edit the application's action script, which has three main routines: start, stop, and check.
    1. Enable the template in the start routine with the opctemplate -e command.
    2. Disable the template in the stop routine with the opctemplate -d command.
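
The procedure above can be sketched as a command sequence. The following is a minimal dry-run sketch rather than a definitive implementation: it only prints the commands for review and executes nothing. The function name print_caa_setup and its two arguments are hypothetical, and the caa_profile -validate, caa_register, and caa_start invocations are assumptions that should be checked against the reference pages of your TruCluster release.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for making an application a CAA
# resource and preparing its HPOM template. Nothing is executed.
# Assumption: the caa_* invocations below match your TruCluster release;
# verify them against the CAA reference pages before running anything.

print_caa_setup() {
    resource=$1   # CAA resource name (hypothetical argument)
    template=$2   # HPOM template name (hypothetical argument)

    echo "caa_profile -validate $resource"
    echo "caa_register $resource"
    echo "caa_start $resource"
    # Run once, after the first template distribution, on every
    # member where the application is NOT running.
    echo "opctemplate -d $template"
}

print_caa_setup xhostname Xhostname
```

Printing the commands first makes it easy to review the order of the steps before running them by hand on the appropriate cluster members.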


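With these changes, if the TruCluster member on which the application is running fails, or if a particular required resource fails, CAA relocates the application to another cluster member, where the start routine of the action script enables the monitoring template.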
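
One way to confirm that the monitoring followed the application is to check the template state on each member after a relocation. The following is a minimal sketch, assuming (as the action script in this topic does) that the output of opctemplate -l contains the word enabled or disabled for the template. The function name template_state is hypothetical.

```shell
#!/bin/sh
# Sketch: classify the state of a template from "opctemplate -l" output
# read on standard input.
# Assumption: the listing for the template contains the word "enabled"
# or "disabled", which is what the example action script greps for.
# template_state is a hypothetical helper name.

template_state() {
    listing=$(cat)
    case $listing in
        *disabled*) echo disabled ;;
        *enabled*)  echo enabled ;;
        *)          echo unknown ;;
    esac
}

# Typical use on a cluster member (requires the HPOM agent):
#   /usr/opt/OV/bin/OpC/opctemplate -l Xhostname | template_state
```

After a failover, the expected result is enabled on the member that now runs the application and disabled on the other members.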

Sample Application

This is a simple Tcl/Tk application named xhostname that HPOM will monitor.
#!/usr/bin/wish
set hname [exec hostname -s]
set clarg [lindex $argv 0]

wm minsize . 350 30
wm title . "$argv0 on $hname $clarg"

button .hostname -font helvb24 -text $hname -command { exit }
pack .hostname -padx 10 -pady 10

CAA Profile

The following is the CAA profile for the xhostname application.
NAME=xhostname
TYPE=application
ACTION_SCRIPT=xhostname.scr
ACTIVE_PLACEMENT=0
AUTO_START=0
CHECK_INTERVAL=60
DESCRIPTION=xhostname
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
OPTIONAL_RESOURCES=
PLACEMENT=balanced
REQUIRED_RESOURCES=
RESTART_ATTEMPTS=1
SCRIPT_TIMEOUT=60
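
The profile is a plain list of NAME=value attributes, so individual settings can be read with standard tools. The following minimal sketch defines a hypothetical helper, profile_get, that prints the value of one attribute. The file argument is whatever profile file CAA created for the resource.

```shell
#!/bin/sh
# Sketch: print the value of one attribute from a CAA profile file.
# profile_get is a hypothetical helper, not part of CAA.
# Usage: profile_get ATTRIBUTE FILE

profile_get() {
    # Select the line that starts with "ATTRIBUTE=" and print what
    # follows the "=". An attribute with no value prints an empty line;
    # an attribute that is absent prints nothing.
    sed -n "s/^$1=//p" "$2"
}
```

For the profile shown above, asking for CHECK_INTERVAL would print 60.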

CAA Action Script

The following is the CAA action script for the xhostname application. This script contains annotations to show you where you need to modify the existing code.

#!/usr/bin/ksh -p
# 
# *****************************************************************
# *                                                               *
# *   Copyright (c) Digital Equipment Corporation, 1991, 1998     *
# *                                                               *
# *   All Rights Reserved.  Unpublished rights  reserved  under   *
# *   the copyright laws of the United States.                    *
# *                                                               *
# *   The software contained on this media  is  proprietary  to   *
# *   and  embodies  the  confidential  technology  of  Digital   *
# *   Equipment Corporation.  Possession, use,  duplication  or   *
# *   dissemination of the software and media is authorized only  *
# *   pursuant to a valid written license from Digital Equipment  *
# *   Corporation.                                                *
# *                                                               *
# *   RESTRICTED RIGHTS LEGEND   Use, duplication, or disclosure  *
# *   by the U.S. Government is subject to restrictions  as  set  *
# *   forth in Subparagraph (c)(1)(ii)  of  DFARS  252.227-7013,  *
# *   or  in  FAR 52.227-19, as applicable.                       *
# *                                                               *
# *****************************************************************
#
# HISTORY
# 
# @(#)$RCSfile$ $Revision$ (DEC) $Date$
# 
# This is the CAA action script for the xhostname application.
# This action script has been modified so that the monitoring
# of the application fails over along with the application.
#
# The start and stop routines of the script have been enhanced
# to enable the monitoring template in the start routine and disable
# the monitoring template in the stop routine, using the
# "opctemplate -e | -d" command. If enabling or disabling the
# monitoring template fails, a message is sent to the HPOM
# management server. For the message to be sent, the opcmsgi agent
# must be running on the managed nodes (assign and distribute the
# default Digital UNIX (Tru64 UNIX) opcmsg(1|3) template to all the
# TruCluster nodes).
#

PATH=/sbin:/usr/sbin:/usr/bin 
export PATH 
XHOSTNAME=/usr/bin/xhostname
export DISPLAY=":0"

# (You need to add this next block of code from here ...
# Paths to the opctemplate and opcmsg commands
OPCTEMPLATE=/usr/opt/OV/bin/OpC/opctemplate
OPCMSG=/usr/opt/OV/bin/OpC/opcmsg

# Monitoring template for the Xhostname application
# that has been assigned and distributed to all the
# TruCluster nodes.
TEMPLATE=Xhostname
# ... to here)

case $1 in 
	'start') 
# Start the xhostname application
		if [ -x $XHOSTNAME ]; then 
			if $XHOSTNAME & 
			then
# (You need to add this next block of code from here ...
# Check if the opctemplate command exists. Enable the template.
				if [ -x $OPCTEMPLATE ]; then
					$OPCTEMPLATE -e $TEMPLATE
# Check if the enabling of the template was successful;
# otherwise, send a message to the HPOM management server.
					if [ `$OPCTEMPLATE -l $TEMPLATE | grep -c enabled` -ne 1 ]
					then
# Check if the opcmsgi agent is running. This agent is needed to send
# the message to the HPOM management server.
						if [ `ps -eaf | grep -v grep | grep -c opcmsgi` -ne 0 ]
						then
							$OPCMSG appl=$TEMPLATE \
							msg_grp=OS \
							object=daemon \
							msg_text="Template $TEMPLATE not enabled" \
							sev=warning
						fi
					fi
				fi
# ... to here)

			fi
		fi
	
		exit 0
	;;
		
	'stop') 
# (You need to add this next block of code from here ...
# Check if the opctemplate command exists and disable the template.
		if [ -x $OPCTEMPLATE ]; then
			$OPCTEMPLATE -d $TEMPLATE
# Check if the disabling of the template was successful;
# otherwise, send a message to the HPOM management server.
			if [ `$OPCTEMPLATE -l $TEMPLATE | grep -c disabled` -ne 1 ]
			then
# Check if the opcmsgi agent is running. This agent is needed to send
# the message to the HPOM management server.
				if [ `ps -eaf | grep -v grep | grep -c opcmsgi` -ne 0 ]
				then
					$OPCMSG appl=$TEMPLATE \
					msg_grp=OS \
					object=daemon \
					msg_text="Unable to disable template $TEMPLATE" \
					sev=warning
				fi
			fi
		fi
# ... to here)

# Check if the xhostname application is running and stop it.
# awk extracts the PID because ps may pad the column with leading blanks.
		ps -eu 0 -o pid,command | grep -v grep |
		grep -E '/usr/bin/xhostname' | awk '{ print $1 }' |
		xargs kill -KILL

		exit 0
	;;

	'check')

		PID=`ps -eu 0 -o command | grep -v grep | grep -E '/usr/bin/xhostname' `
		if [ -z "$PID" ] ; then
			exit 1
		fi

		exit 0
	;;

	*)
		echo "usage: $0 {start|stop|check}"
		exit 1
	;; 
esac
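
The start and stop routines above repeat the same enable-or-disable, verify, and report logic. As a hedged alternative, that logic could be factored into one helper. The sketch below is not part of the original script: set_template_state is a hypothetical name, it simplifies the original by omitting the check that the opcmsgi agent is running, and it uses the same command paths and opcmsg attributes as the action script.

```shell
#!/bin/sh
# Sketch: one helper for the enable/disable logic that the start and
# stop routines duplicate. set_template_state is a hypothetical name.
# OPCTEMPLATE and OPCMSG default to the paths used in the action script
# and can be overridden.

OPCTEMPLATE=${OPCTEMPLATE:-/usr/opt/OV/bin/OpC/opctemplate}
OPCMSG=${OPCMSG:-/usr/opt/OV/bin/OpC/opcmsg}

# Usage: set_template_state enabled|disabled TEMPLATE
set_template_state() {
    want=$1
    template=$2

    # Nothing to do if the opctemplate command is not installed here.
    [ -x "$OPCTEMPLATE" ] || return 0

    case $want in
        enabled)  "$OPCTEMPLATE" -e "$template" ;;
        disabled) "$OPCTEMPLATE" -d "$template" ;;
        *)        return 2 ;;
    esac

    # Verify the result the way the action script does: look for the
    # state word in the template listing.
    if "$OPCTEMPLATE" -l "$template" | grep "$want" >/dev/null; then
        return 0
    fi

    # Report the failure to the HPOM management server.
    # Simplification: unlike the original script, this sketch does not
    # first check that the opcmsgi agent is running.
    "$OPCMSG" appl="$template" msg_grp=OS object=daemon \
        msg_text="Template $template not $want" sev=warning
    return 1
}
```

With this helper, the start routine would call set_template_state enabled $TEMPLATE after launching the application, and the stop routine would call set_template_state disabled $TEMPLATE before killing it.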
