Saturday, April 26, 2008

Monitoring and Troubleshooting a Cluster


This chapter presents general information for monitoring and troubleshooting an HACMP for Linux configuration.
This chapter contains the following sections:
•Problem Determination Tools
•Viewing Cluster Information (clstat) in WebSMIT
•Useful Commands
•Logging Messages
•Solving Common Problems with Networks and Applications.
Problem Determination Tools
The WebSMIT Problem Determination Tools menu provides a set of tools for troubleshooting and recovering from problems that may arise in a cluster environment.
The Problem Determination Tools panel in WebSMIT includes:
•View Current State. WebSMIT displays cluster information using a slightly different layout and organization. Cluster components are displayed along with their status. Expanding an item reveals additional information about it, including networks, interfaces and active resource groups.
•HACMP Log Viewing and Management. Contains utilities that display or manage logs maintained by HACMP. These include the log file named hacmp.out, which keeps a record of all of the local cluster events as performed by the HACMP event scripts. These HACMP event scripts automate many common system administration tasks and, in the event of a failure, manage HACMP and system resources to provide recovery.
•Recover From HACMP Script Failure. Contains a command that HACMP will run to recover from a script failure. This is useful if the Cluster Manager is in reconfiguration due to a failed event script. Use this option after having manually fixed the error condition.
•Restore HACMP Configuration Database from Active Configuration.
Viewing Cluster Information (clstat) in WebSMIT
With HACMP 5.4.1, you can use WebSMIT to:
•Display detailed cluster information
•Navigate and view the status of the running cluster
•Configure and manage the cluster
•View graphical displays of sites, networks, nodes and resource group dependencies.
Useful Commands
You have these additional utilities:
•To view the resource group location and status, use the clRGinfo command.
•To view the service IP label information, run the ifconfig command on the node that currently owns the resource group.
For a list of commands supported in HACMP for Linux, see Command Reference in Appendix A: Command Reference and the clinfo Utility.
Logging Messages
HACMP for Linux uses the standard logging facilities for HACMP. For information about logging in HACMP, see the HACMP for AIX Troubleshooting Guide.
To troubleshoot the HACMP operations in your cluster, use the event summaries in the hacmp.out file and syslog.
The system logs messages into the following files:
•/tmp/clstrmgr.debug
•/tmp/cspoc.log
•/tmp/clappmond
•/tmp/hacmp.out
•/usr/es/adm/cluster.log
•/var/hacmp/clcomd/clcomd.log
•/var/hacmp/clcomd/clcomddiag.log
•/var/hacmp/log/clutils.log
•/usr/es/sbin/cluster/wsm/logs/wsm_smit.*
•/websmit/logs/wsm_smit.*
•/usr/es/sbin/cluster/snapshots/*
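As a quick orientation aid, the following sketch reports which of the fixed-name log files listed above exist on a node. The function name list_hacmp_logs and its optional root-prefix argument are assumptions made for illustration (the prefix lets you point the check at a copy of a node's file system); only the paths come from the list above, and the wildcard entries are left out.

```shell
# list_hacmp_logs is a hypothetical helper, not an HACMP command.
# It prints "present <path>" or "missing <path>" for each fixed-name log file.
list_hacmp_logs() {
    root="${1:-}"    # optional prefix; empty means the live file system
    for f in \
        /tmp/clstrmgr.debug \
        /tmp/cspoc.log \
        /tmp/clappmond \
        /tmp/hacmp.out \
        /usr/es/adm/cluster.log \
        /var/hacmp/clcomd/clcomd.log \
        /var/hacmp/clcomd/clcomddiag.log \
        /var/hacmp/log/clutils.log
    do
        if [ -f "$root$f" ]; then
            echo "present $f"
        else
            echo "missing $f"
        fi
    done
}
```

Calling list_hacmp_logs with no argument checks the local node.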
Collecting Cluster Log Files for Problem Reporting
To view the system files and log files as they are collected in an archive file:
1.In WebSMIT, go to the Collect Cluster log files for Problem Reporting menu.
2.Type or select values in entry fields.
3.Use an appropriate Linux tool to extract or view the archive file. The archive file contains the log and system files.

Resetting Cluster Tunables


You can reset tunable values that were altered during cluster maintenance to their default, installation-time settings. The installation-time cluster settings are the values that appear in the cluster after installing HACMP from scratch.
Note: Resetting the tunable values does not change any other aspects of the configuration, while installing HACMP removes all user-configured configuration information including nodes, networks, and resources.
To reset the cluster tunable values:
1.Stop the cluster services.
2.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
3.In WebSMIT, select Extended Configuration > Extended Topology Configuration > Configure an HACMP Cluster > Reset Cluster Tunables and press Continue.
Use this option to reset all the tunables (customizations) made to the cluster. For a list of the tunable values that will change, see the section Listing Tunable Values. Using this option returns all tunable values to their default values but does not change the cluster configuration. HACMP takes a snapshot file before resetting. You can choose to have HACMP synchronize the cluster when this operation is complete.
4.Select the options as follows and press Continue:
Synchronize Cluster Configuration
If you set this option to yes, HACMP synchronizes the cluster after resetting the cluster tunables.
5.HACMP asks: “Are you sure?”
6.Press Continue.
HACMP resets all the tunable values to their original settings and removes those that should be removed (such as the nodes’ knowledge about customized pre- and post-event scripts).
Resetting HACMP Tunable Values using the Command Line
We recommend that you use the SMIT interface to reset the cluster tunable values. The clsnapshot -t command also resets the cluster tunables. This command is intended for use by IBM support. See the man page for more information.
Listing Tunable Values
You can change and reset the following list of tunable values:
•User-supplied information.
•Network module tuning parameters, such as failure detection rate, grace period and heartbeat rate. HACMP resets these parameters to their installation-time default values.
•Cluster event customizations, such as all changes to cluster events. Note that resetting changes to cluster events does not remove any files or scripts that the customizations use; it only removes the knowledge HACMP has of pre- and post-event scripts.
•Cluster event rule changes made to the event rules database are reset to the installation-time default values.
•HACMP command customizations made to the default set of HACMP commands are reset to the installation-time defaults.
•Automatically generated and discovered information.
Generally users cannot see this information. HACMP rediscovers or regenerates this information when the cluster services are restarted or during the next cluster synchronization.
HACMP resets the following:
•Local node names stored in the cluster definition database
•Netmasks for all cluster networks
•Netmasks, interface names and aliases for disk heartbeating (if configured) for all cluster interfaces
•SP switch information generated during the latest node_up event (this information is regenerated at the next node_up event)
•Instance numbers and default log sizes for the RSCT subsystem.
Understanding How HACMP Resets Cluster Tunables
HACMP resets tunable values to their default values under the following conditions:
•Before resetting HACMP tunable values, HACMP takes a cluster snapshot. After the values have been reset to defaults, if you want to go back to your customized cluster settings, you can restore them with the cluster snapshot. HACMP saves snapshots of the last ten configurations in the default cluster snapshot directory, /usr/es/sbin/cluster/snapshots, with the name active.x.odm, where x is a digit between 0 and 9, with 0 being the most recent.
•Stop cluster services on all nodes before resetting tunable values. HACMP prevents you from resetting tunable values in a running cluster.
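To see which automatic snapshots are available before restoring one, a sketch along the following lines lists the active.x.odm files from most recent (0) to oldest (9). The function name list_active_snapshots is hypothetical; the directory and file naming follow the description above.

```shell
# list_active_snapshots is a hypothetical helper, not an HACMP command.
# It prints existing active.x.odm files, most recent (x = 0) first.
list_active_snapshots() {
    dir="${1:-/usr/es/sbin/cluster/snapshots}"
    x=0
    while [ "$x" -le 9 ]; do
        if [ -f "$dir/active.$x.odm" ]; then
            echo "$dir/active.$x.odm"
        fi
        x=$((x + 1))
    done
}
```

The optional argument is only there so the function can be pointed at a different directory, for example a copy of the snapshot directory taken for problem reporting.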
In some cases, HACMP cannot differentiate between user-configured information and discovered information, and does not reset such values. For example, you may enter a service label and HACMP automatically discovers the IP address that corresponds to that label. In this case, HACMP does not reset the service label or the IP address. The cluster verification utility detects if these values do not match.
The clsnapshot.log file in the snapshot directory contains log messages for this utility. If any of the following scenarios are run, then HACMP cannot revert to the previous configuration:
•cl_convert is run automatically
•cl_convert is run manually

System Management (C-SPOC) Tasks


Use the System Management (C-SPOC) panel in WebSMIT to configure, from one node, the resources that are shared among nodes. The System Management utility of HACMP lets you administer many aspects of the cluster and its components from one Cluster Single Point of Control (C-SPOC). By automating repetitive tasks, C-SPOC eliminates a potential source of errors and speeds up the cluster maintenance process.
In WebSMIT, you access C-SPOC using the System Management (C-SPOC) menu.
In this panel, you can do the following tasks from one node:
•Manage HACMP services, or start and stop cluster services: Cluster Manager (clstrmgr) and Cluster Information (clinfo).
•HACMP Communication Interface Management. Manage the communication interfaces of existing cluster nodes using C-SPOC.
•HACMP Resource Group and Application Management. Provides menus to manage cluster resource groups and analyze cluster applications.
•HACMP Log Viewing and Management. Manage, view, and collect HACMP log files and event summaries.
Starting HACMP Cluster Services
To start HACMP cluster services:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > Manage HACMP Services > Start HACMP Services and press Continue.
For detailed instructions, see the HACMP for AIX Administration Guide.
Stopping HACMP Cluster Services
To stop HACMP cluster services:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > Manage HACMP Services > Stop HACMP Services and press Continue.
For detailed instructions, see the HACMP for AIX Administration Guide.
Managing Resource Groups and Applications
To manage resource groups and applications:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > HACMP Resource Group and Application Management and press Continue.
Viewing and Managing Logs
To view and manage logs:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > HACMP Log Viewing and Management and press Continue.
For detailed instructions, see the HACMP for AIX Administration Guide.

Viewing the Cluster Status


HACMP provides a cluster status utility, /usr/es/sbin/cluster/clstat. It reports the status of key cluster components: the cluster itself, the nodes in the cluster, the network interfaces connected to the nodes, and the resource groups on each node.
clstat is available in WebSMIT at the left side of the top-level menu. It displays an expandable list of cluster components along with their status. The cluster status display window shows information and status (up or down, online, offline or error) on cluster nodes, networks, interfaces, application servers and resource groups. For resource groups, it also shows the node on which the group is currently hosted.
Here is an example of the clstat output in WebSMIT. This is the left-hand side panel of the window:


Figure 2. clstat Output
Here is an example of the ASCII-based output from the clstat command, run on a four-node Linux cluster with nodes named ppstest1 through ppstest4:
ppstest2:~ # /usr/es/sbin/cluster/clstat
clstat - HACMP Cluster Status Monitor
-------------------------------------
Cluster: test1234 (1148058900)
Wed May 17 16:45:41 2006
State: UP    Nodes: 4
SubState: STABLE
  Node: ppstest1    State: UP
    Interface: tr0 (6)    Address: 9.57.28.3
      State: UP
    Resource Group: rg1    State: On line
  Node: ppstest2    State: UP
    Interface: tr0 (6)    Address: 9.57.28.4
      State: UP
    Resource Group: rg2    State: On line
  Node: ppstest3    State: UP
    Interface: tr0 (6)    Address: 9.57.28.5
      State: UP
  Node: ppstest4    State: UP
    Interface: tr0 (6)    Address: 9.57.28.6
      State: UP
    Resource Group: rg3    State: On line
    Resource Group: rg4    State: On line
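If you save output like the above to a text file, a short filter can reduce it to one line per node and one line per resource group. The following is a minimal sketch that assumes the line layout shown in this example ("Node: name State: value" and "Resource Group: name State: value"); the function name clstat_summary is hypothetical and is not part of HACMP.

```shell
# clstat_summary is a hypothetical helper that reads saved clstat text on
# standard input and prints:
#   node <name> <state>      for each "Node:" line
#   rg <group> <node>        for each "Resource Group:" line under that node
clstat_summary() {
    awk '
        $1 == "Node:" { node = $2; print "node", node, $4 }
        $1 == "Resource" && $2 == "Group:" { print "rg", $3, node }
    '
}
```

For example, clstat_summary < clstat_output.txt, where clstat_output.txt is whatever file you saved the output to. Because awk ignores leading whitespace when splitting fields, the filter works whether or not the saved text is indented.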

Configuring HACMP Application Servers


To configure an application server on any cluster node:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Servers > Add an Application Server and press Continue.
WebSMIT displays the Add an Application Server panel.
3.Enter field values as follows:
Server Name
Enter an ASCII text string that identifies the server. You will use this name to refer to the application server when you define resources during node configuration. The server name can include alphabetic and numeric characters and underscores. Use no more than 64 characters.
Start Script
Enter the pathname of the script (followed by arguments) called by the cluster event scripts to start the application server. (Maximum 256 characters.) This script must be in the same location on each cluster node that might start the server. The contents of the script, however, may differ.
Stop Script
Enter the pathname of the script called by the cluster event scripts to stop the server. (Maximum 256 characters.) This script must be in the same location on each cluster node that may start the server. The contents of the script, however, may differ.
4.Press Continue to add this information to the HACMP Configuration Database on the local node.
5.Add the application start, stop and notification scripts to every node in the cluster.
Verifying Application Servers
Make sure that the application start, stop and notification scripts exist and are executable on every node in the cluster. Use the cllsserv command.
For example:
ppstest2:~ # /usr/es/sbin/cluster/utilities/cllsserv
app_test2_primary /usr/local/app_start /usr/local/app_stop
ppstest2:~ # ls -l /usr/local/app_start
-rwxr--r-- 1 root root 169 May 10 22:54 /usr/local/app_start
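The manual check above can be repeated for every configured server with a small loop. The sketch below assumes that each line of cllsserv output has the three-column form shown in the example (server name, start script, stop script) and that the script paths contain no arguments or spaces; check_server_scripts is a hypothetical function name, not an HACMP utility.

```shell
# check_server_scripts is a hypothetical helper. It reads lines of the form
#   <server name> <start script> <stop script>
# on standard input and reports whether each script is executable.
# It returns 1 if any script is missing or not executable, 0 otherwise.
check_server_scripts() {
    status=0
    while read -r name start stop; do
        [ -n "$name" ] || continue    # skip blank lines
        for script in "$start" "$stop"; do
            if [ -x "$script" ]; then
                echo "ok $name $script"
            else
                echo "not-executable $name $script"
                status=1
            fi
        done
    done
    return "$status"
}
```

On a cluster node this could be used as /usr/es/sbin/cluster/utilities/cllsserv | check_server_scripts, and it would need to be run on every node, since the scripts must exist in the same location on each one.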
Configuring Application Monitors
Once you have configured application servers, HACMP for Linux lets you configure application monitors that check the health of the running application process or check for the successful start of the application.
To configure application monitors:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Monitoring and press Continue. A selector screen appears for Configure Process Application Monitoring and Configure Custom Application Monitoring.
3.Select the type of monitoring you want and press Continue.
4.Select the application server to which you want to add a monitor.
5.Fill in the field values and press Continue.
For additional reference information on application monitoring, its types, modes, and other information, see the HACMP for AIX Administration Guide.
Including Resources into Resource Groups
Once you have configured resources to HACMP, include them in resource groups so that HACMP can manage them as a single set. For example, if an application depends on a service IP label, you can add both the application server and the service IP label to a single resource group.
HACMP manages the resources in a resource group by bringing the resource groups online and offline on their home node(s), or moving them to other nodes, if necessary for recovery.
Note: For detailed instructions on resource groups, see the HACMP for AIX Administration Guide. This guide contains descriptions of procedures in HACMP SMIT, and the options are identical to those used in WebSMIT in HACMP for Linux.
Resource Group Management: Overview
In the Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration WebSMIT screen, you can:
•Add a resource group.
•Change/Show a resource group. The system displays all previously defined resource groups. After selecting a particular resource group, you can view and change the group name, node relationship, and participating nodes (nodelist). You can also change the group’s startup, fallover and fallback policies.
•Remove a resource group.
•Change/Show resources for a resource group. Add resources, such as a service IP label for the application, or an application server, to a resource group. HACMP always brings these resources online and takes them offline on a particular node as a single set. If you want HACMP to activate one set of resources on one node and another set of resources on another node, create separate resource groups for each set.
•Show all resources by node for a resource group.
HACMP for Linux does not allow you to change resources dynamically, that is, while HACMP cluster services are running on the nodes. To change previously added resources, first stop the cluster services.
Adding Resources to a Resource Group
To include resources into a resource group:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Change/Show All Resources and Attributes for a Resource Group and press Continue.
3.Fill in the field values and press Continue. HACMP adds the resources.
For additional information on adding or changing resources in resource groups, and for information on other resource group management tasks, see the Administration Guide.
Synchronizing the HACMP Cluster Configuration
We recommend that you do all the configuration from one node and synchronize the cluster to propagate this information to other nodes.
Use this WebSMIT option to commit and distribute your changes automatically to all of the specified nodes.
To synchronize an HACMP cluster configuration:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Verification and Synchronization and press Continue.
If you configured the cluster correctly, HACMP synchronizes the configuration. HACMP issues errors if the configuration is not valid.
Displaying the HACMP Cluster Configuration
You can ask HACMP to show you the status of different configured components. The WebSMIT options for displaying different cluster entities are grouped together with the options for adding them to the cluster.
Here are some examples of the options you have:
•Show HACMP Topology by node, by network name, or by communication interface
•Change/Show Persistent IP Labels
•Show Cluster Applications and change/show application monitors per application
•Change/Show Service IP Labels
•Show all Resources by Node or Resource Groups
•View cluster logs (in WebSMIT, this option is under System Management (C-SPOC) > HACMP Log Viewing and Management)
•Show Cluster Services (whether running or not).

Configuring Service IP Labels


To add service IP labels/addresses as resources to the resource group in your cluster:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Service IP Labels/Addresses > Add a Service IP Label/Address and press Continue.
3.Fill in field values as follows:
IP Label/Address
Enter, or select from the picklist, the IP label/address to be kept highly available.
Network Name
Enter the symbolic name of the HACMP network on which this Service IP label/address will be configured.
4.Press Continue after filling in all required fields.
5.Repeat the previous steps until you have configured all service IP labels/addresses for each network, as needed.

WebSMIT Tasks Overview


The main WebSMIT menu in HACMP for Linux contains the following menu items and tabs:
•Extended Configuration to configure your cluster.
•System Management (C-SPOC). C-SPOC (Cluster Single Point of Control) is an HACMP function that lets you run HACMP cluster-wide configuration commands from one node in the cluster. In HACMP for Linux, you can use System Management (C-SPOC) to start and stop the cluster services and to move, bring online and bring offline resource groups.
•Problem Determination Tools. You can customize cluster verification, view current cluster state, view logs, recover from a cluster event failure, configure error notification methods and perform other troubleshooting tasks.
•HACMP Documentation. This is the top-level tab that contains a page with links to all online and printable versions of HACMP documentation, including this guide.
Here is the top-level HACMP for Linux WebSMIT menu:



Tasks for Configuring a Basic Cluster

You configure an HACMP for Linux cluster using the Extended Configuration path in WebSMIT.
Note: In general, the sections in this guide provide a high-level overview of user interface options. See the HACMP for AIX Administration Guide for detailed procedures, field help, and recommendations for configuring each HACMP component.
To configure a basic cluster:
1.On one cluster node, configure a cluster name and add cluster nodes. See:
•Defining a Cluster Name
•Adding Nodes and Establishing Communication Paths
2.On each cluster node, configure all supporting networks and interfaces: serial networks for heartbeating and IP-based cluster networks for cluster communication.
Also, configure to HACMP the communication devices and boot network interfaces that you previously defined to the operating system, and configure persistent IP labels for cluster administration purposes. See:
•Configuring Serial Networks for Heartbeating
•Adding IP-Based Networks
•Configuring Communication Interfaces/Devices to HACMP
•Adding Persistent IP Labels for Cluster Administration Purposes
3.On one cluster node, configure cluster resources that will be associated with the application: service IP labels, application servers and application monitors. See:
•Configuring Resources to Make Highly Available
•Configuring Service IP Labels
•Configuring Application Servers
•Configuring Application Monitors
4.On one cluster node, include resources into resource groups. See Including Resources into Resource Groups.
5.Synchronize the cluster configuration. See Synchronizing the HACMP Cluster Configuration.
6.View the HACMP cluster configuration. See Displaying the HACMP Cluster Configuration.
7.Start the HACMP for Linux cluster services on the cluster nodes. When you do so, HACMP will activate the resource group with the application, and will start monitoring it for high availability. See Starting HACMP Cluster Services.
Defining a Cluster Name
Before starting to configure a cluster:
•Make sure that you added all necessary entries to the /etc/hosts file on each machine that will serve as a cluster node. See Planning IP Networks and Network Interfaces.
•Make sure that WebSMIT is installed and can be started on one of the nodes. See Installing and Configuring WebSMIT.
•Log in to WebSMIT. See Starting WebSMIT.
The only step necessary to configure a cluster is to assign the cluster name. When you assign a name to your cluster in WebSMIT, HACMP associates this name with the HACMP-assigned cluster ID.
To assign a cluster name and configure a cluster:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Topology Configuration > Configure an HACMP Cluster > Add/Change/Show an HACMP Cluster and press Continue.
3.Enter field values as follows:
Cluster Name
Enter an ASCII text string that identifies the cluster. The cluster name can include alphanumeric characters and underscores, but cannot have a leading numeric. Use no more than 32 characters. Do not use reserved names. For a list of reserved names see List of Reserved Words.
4.Press Continue. If you are changing an existing cluster name, restart HACMP for changes to take effect.
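The naming rules for the Cluster Name field can be checked before you type the name into WebSMIT. The following sketch tests only the rules stated above (alphanumeric characters and underscores, no leading numeric, at most 32 characters); it does not check the list of reserved words, and valid_cluster_name is a hypothetical function name rather than an HACMP utility.

```shell
# valid_cluster_name is a hypothetical helper. It returns 0 if the name
# satisfies the stated rules and 1 otherwise. Reserved words are NOT checked.
valid_cluster_name() {
    name="$1"
    [ -n "$name" ] || return 1             # must not be empty
    [ "${#name}" -le 32 ] || return 1      # no more than 32 characters
    case "$name" in
        [0-9]*) return 1 ;;                # no leading numeric
        *[!A-Za-z0-9_]*) return 1 ;;       # only alphanumerics and underscores
    esac
    return 0
}
```

For example, valid_cluster_name test1234 succeeds, while valid_cluster_name 1cluster fails because of the leading digit.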

Understanding Cluster Network Requirements and Heartbeating


To avoid a single point of failure, the cluster should have more than one network configured. Often the cluster has both IP and non-IP based networks, which allows HACMP to use different heartbeat paths. Use the Add a Network to the HACMP Cluster WebSMIT panel to configure HACMP IP and point-to-point networks.

You can use any or all of these methods for heartbeat paths:
•Point-to-point networks
•IP-based networks, including heartbeating using IP aliases.
Launching the WebSMIT Interface
Use WebSMIT to:
•Navigate the running cluster.
•View and customize graphical displays of networks, nodes and resource group dependencies.
•View the status of any connected node (with HACMP cluster services running on the nodes).
Starting WebSMIT
For instructions on integrating WebSMIT with your Apache server, and for launching WebSMIT, see the /usr/es/sbin/cluster/wsm/README readme file. It contains sample post-install scripts with variables. Each variable is commented with an explanation of its purpose along with the possible values. You can modify the values of the variables to influence the script behavior.
To start WebSMIT:
1.Using a web browser, navigate to the secure URL of your cluster node, for instance enter the URL similar to the following:
https://..com:42267
42267 is the port number for HACMP for Linux. The entry is optional; it is only necessary if you are logging in to a server that is not part of your local network. The system asks you to log in.
2.Log in to the system and press Continue. WebSMIT starts.

HACMP - Installation Process Overview


Install the HACMP for Linux software on each cluster node (server). Perform the installation process as the root user.
Installing HACMP for Linux RPMs
Before you install, ensure that you have installed all the prerequisites for the installation. See Software Prerequisites for Installation.
To install HACMP for Linux:
1.Insert the HACMP for Linux CD-ROM and install the hacmp.license.rpm RPM:
rpm -ivh hacmp.license.rpm
This RPM provides a utility that lets you accept the License Agreement for HACMP for Linux v.5.4.1, and complete the installation.
Note: You can install the HACMP for Linux documentation without accepting the License Agreement.
2.Run the HACMP installation script /usr/es/sbin/cluster/install_hacmp.
This script has two options:
-y
Lets you automatically accept the License Agreement. By specifying this flag you agree to the terms and conditions of the License Agreement and will not be prompted.
-d
Lets you specify an alternate path to the RPMs for installation, if you are not installing directly from the CD-ROM.
The /usr/es/sbin/cluster/install_hacmp installation script launches the License Agreement Program (LAP) and the License Agreement acceptance dialog appears.
3.Read and accept the License Agreement. The software places a key on your system to identify that you accepted the license agreement.
You can also accept the license without installing the rest of the filesets by running the /usr/es/sbin/cluster/hacmp_license script. You can then use the RPM tool to install the remaining RPMs.
The /usr/es/sbin/cluster/install_hacmp installation script checks for the following prerequisites for the HACMP for Linux software:
•rsct.basic-x.x.x.x
•rsct.core-x.x.x.x
•rsct.core.utils-x.x.x.x
•An appropriate version of ksh93 (such as ksh-20050202-1.ppc.rpm)
•Perl 5 (an RSCT prerequisite; perl-5.8.3 is installed with RHEL)
•src-1.3.0.1 (an RSCT prerequisite)
4.Check the required RSCT levels in the HACMP for Linux v.5.4.1 Release Notes, or in the section Software Prerequisites for Installation.

You can install HACMP for Linux when prerequisites are already installed, or together with the prerequisites.
The /usr/es/sbin/cluster/install_hacmp installation script runs the rpm command to install HACMP for Linux RPMs:
rpm -ivh hacmp.*
5.Verify the installed cluster software. Verify that the RPMs have correct version numbers and other specific information. The RPMs cannot be installed when prerequisites are not installed.
6.Configure WebSMIT. See /usr/es/sbin/cluster/wsm/README for information, as well as the section Installing and Configuring WebSMIT in this chapter.
7.Read the HACMP for Linux v.5.4.1 Release Notes, /usr/es/sbin/cluster/release_notes.linux, for information that does not appear in the product documentation.
Note: You can manually install all RPMs without using the install_hacmp script.
Installing and Configuring WebSMIT
WebSMIT is a Web-based user interface that provides consolidated access to all functions of HACMP configuration and management, interactive cluster status, and the HACMP documentation.
WebSMIT:
•Is supported on Mozilla-based browsers (Mozilla 1.7.3 for AIX and Firefox 1.5.0.2).
•Is supported on Internet Explorer, versions 5.0, 5.5 and 6.0.
•Requires that JavaScript is enabled in your browser.
•Requires network access between the browser and the cluster node that serves as a Web server. To run WebSMIT on a node, you must ensure HTTP(S)/SSL connectivity to that node; it is not handled automatically by WebSMIT or HACMP.
To launch the WebSMIT interface:
1.Configure and run a Web server process, such as Apache server, on the cluster node(s) to be administered.
2.See the /usr/es/sbin/cluster/wsm/README file for information on basic Web server configuration, the default security mechanisms in place when installing HACMP, and the configuration files available for customization.
You can run WebSMIT on a single node. Note that WebSMIT will be unavailable if that node fails. To provide better availability, you can set up WebSMIT to run on multiple nodes. Since WebSMIT retrieves and updates information from the HACMP cluster, that information should be available from all nodes in the cluster.
Typically, you set up WebSMIT to be accessible from the cluster’s internal network that is not reachable from the Internet.
Since the WebSMIT interface runs in a Web browser, you can access it from any platform. For information on WebSMIT security, see Security Considerations.

For more information about installing WebSMIT, see the section Installing and Configuring WebSMIT in the HACMP for AIX Installation Guide.
Integration of WebSMIT with the Apache Server on Different Linux Distributions
The WebSMIT readme file /usr/es/sbin/cluster/wsm/README contains different template files and instructions to enable you to handle variations in packaging, when integrating WebSMIT with the Apache server on different Linux distributions.
Verifying the Installed Cluster Software
After the HACMP for Linux software is installed on all nodes, verify the configuration. Use the verification functions of the RPM utility: your goal is to ensure that the cluster software is the same on all nodes.
Verify that the information returned by the rpm command is accurate:
rpm -qi hacmp.server
rpm -qi hacmp.client
rpm -qi hacmp.license
rpm -qi hacmp.doc.html
rpm -qi hacmp.doc.pdf
Each command should return information about each RPM. In particular, the Name, Version, Vendor, Summary and Description fields should contain appropriate information about each package.
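Because the goal is to confirm that every node carries the same software level, it can help to pull out just the fields you want to compare. The sketch below extracts a single field from rpm -qi output read on standard input; it assumes the "Field : value" layout shown in the installation example in this chapter, and rpm_info_field is a hypothetical helper name.

```shell
# rpm_info_field is a hypothetical helper. It reads "rpm -qi" output on
# standard input and prints the first word of the value of the named
# left-column field (for example Version or Release).
rpm_info_field() {
    awk -v key="$1" '$1 == key && $2 == ":" { print $3; exit }'
}
```

For example, rpm -qi hacmp.server | rpm_info_field Version prints only the version string, which you can then compare across nodes. The helper handles single-word values only, so it is not suited to fields such as Summary.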
HACMP modifies several system files during the installation process (such as /etc/inittab, /etc/services, and others). To view the details of the installation process, see the log file /var/hacmp/log/hacmp.install.log..
Example of the Installation Using RPM
Here is an example of the installation using rpm:
# rpm -iv hacmp*
Preparing packages for installation...
Cluster services are not active on this node.
hacmp.client-5.4.1.0-06128
Cluster services are not active on this node.
hacmp.server-5.4.1.0-06128
May 8 2006 22:26:18 Starting execution of /usr/es/sbin/cluster/etc/rc.init
with parameters:
May 8 2006 22:26:18 Completed execution of /usr/es/sbin/cluster/etc/rc.init
with parameters: .
Exit status = 0
Installation of HACMP for Linux is complete.
After installation, use the rpm command to view the information about the installed product:
ppstest3:~ # rpm -qa | grep hacmp
hacmp.server-5.4.1.0-06128
hacmp.client-5.4.1.0-06128
ppstest3:~ # rpm -qi hacmp.server-5.4.1.0-06128
Name : hacmp.server Relocations: (not relocatable)
Version : 5.4.1.0 Vendor: IBM Corp.

Release : 06128 Build Date: Mon May 8 21:21:09 2006
Install date: Tue May 9 13:03:20 2006 Build Host: bldlnx18.ppd.pok.ibm.com
Group : System Environment/Base Source RPM: hacmp.server-5.4.1.0-06128.nosrc.rpm
Size : 48627953 License: IBM Corp.
Signature : (none)
Packager : IBM Corp.
URL : http://www.ibm.com/systems/p/ha/
Summary : High Availability Cluster Multi-Processing - server part
Description :
hacmp.server provides the server side functions for HACMP.
Service information for this package can be found at
http://techsupport.services.ibm.com/server/cluster
Product ID 5765-G71
Distribution: (none)
Entries Added to System Directories after Installation
After you install HACMP for Linux, the installation process adds the following line to the /usr/es/sbin/cluster/etc/inittab file:
harc:2345:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1
SRC definitions are added (run lssrc -s ):
Subsystem Group
clcomdES clcomdES
clstrmgrES cluster
topsvcs topsvcs
grpsvcs grpsvcs
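A quick way to confirm that the inittab entry shown above is present is to search for its harc identifier. This is a minimal sketch: the helper name has_harc_entry is hypothetical, and the file to check is passed as an argument rather than assumed.

```shell
#!/bin/sh
# Sketch only: has_harc_entry is a hypothetical helper, not an HACMP command.

# Succeed if the given inittab-style file contains a "harc" entry
# that runs the HACMP rc.init script.
has_harc_entry() {
    grep -q '^harc:.*/usr/es/sbin/cluster/etc/rc\.init' "$1"
}
```

Call it with the inittab file you want to inspect, for example has_harc_entry /etc/inittab, and test the exit status.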
Addressing Problems during Installation
If you experience problems during the installation, refer to the RPM documentation for information on a cleanup process after an unsuccessful installation and other issues.
To view the details of the installation process, see the following log file:
/var/hacmp/log/hacmp.install.log

Contents of the Installation Media


The HACMP for Linux installation media provides the following .rpm files:
hacmp.server-5.4.1.0.ppc.rpm
High Availability Cluster Multi-Processing—server part. hacmp.server provides the server-side functions for HACMP.
hacmp.client-5.4.1.0.ppc.rpm
High Availability Cluster Multi-Processing—client part. hacmp.client provides the client-side functions for HACMP.
hacmp.license
hacmp.license-5.4.1.0.ppc.rpm
HACMP for Linux License Package. hacmp.license provides the software License Agreement functions for the HACMP for Linux software.
hacmp.doc
hacmp.doc.html-5.4.1.0.ppc.rpm
HACMP for Linux HTML documentation—U.S. English
hacmp.doc.pdf-5.4.1.0.ppc.rpm
HACMP for Linux PDF documentation—U.S. English

Planning the HACMP Configuration


Plan to have the following components in an HACMP cluster:
•An application
•Up to eight nodes
•Resource groups
•Networks.
Planning Applications
Once you put an application under HACMP’s control, HACMP starts it on the node(s) and, if you define application monitors, periodically polls the application’s status. If a component fails, HACMP moves the application to another node, and the move is invisible to the application’s end users.
Plan to have the following for your application:
•Customized application start and stop scripts and their locations. The scripts should contain all pre- and post-processing you want HACMP to do so that it starts and stops the applications on the nodes cleanly and according to your requirements. You define these scripts as the application server in WebSMIT.
•Customized scripts you may want to use in HACMP for monitoring the application’s successful startup, and for periodically checking the application’s running process. You define these scripts to HACMP as application monitors in WebSMIT.
•If you have a complex production environment with tiered applications that require dependencies between their startup, or a staged production environment where some applications should start only if their “supporting” applications are already running, HACMP supports these configurations by letting you configure multiple types of dependencies between resource groups in WebSMIT.
To configure a working cluster that will support such dependent applications, first plan the dependencies for all the services that you want to make highly available. For examples of such planning, see the HACMP for AIX Planning Guide and Administration Guide (sections on multi-tiered applications and resource group dependencies).
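The start, stop, and monitor scripts described in the list above can be sketched as three small shell functions. This is an illustration under stated assumptions, not a script supplied by HACMP: a background sleep stands in for the real application, the PID file location is hypothetical, and the sketch assumes a monitor reports a healthy application with exit status 0 and a failed one with a non-zero status. Check the HACMP documentation for the exact contract your monitor must follow.

```shell
#!/bin/sh
# Sketch only: "sleep" is a stand-in for a real application, and the
# PID file path is a hypothetical choice.
PIDFILE="${PIDFILE:-/tmp/demo_app.pid}"

# Start the application in the background and record its process ID.
app_start() {
    sleep 300 >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

# Monitor: exit status 0 if the recorded process is alive, non-zero if not.
app_monitor() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

# Stop the application if it is running, then remove the PID file.
app_stop() {
    if app_monitor; then
        kill "$(cat "$PIDFILE")"
    fi
    rm -f "$PIDFILE"
}
```

In practice each function would live in its own script, and you would enter the start and stop script paths as the application server and the monitor script as an application monitor in WebSMIT.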

In HACMP 5.4.1, you can use WebSMIT to take an application out of HACMP’s control temporarily without disrupting it, and then restart HACMP on the nodes that currently run the application.
Planning HACMP Nodes
HACMP for Linux lets you configure up to eight HACMP nodes.
For each critical application, be mindful of the resources required by the application, including its processing and data storage requirements. For example, when you plan the size of your cluster, include enough nodes to handle the processing requirements of your application after a node fails.
Keep in mind the following considerations when determining the number of cluster nodes and planning the nodes:
•An HACMP cluster can be made up of any combination of supported workstations, LPARs, and other machines. See Hardware for Cluster Nodes. Ensure that cluster nodes do not share components that could be a single point of failure (for example, a power supply). Similarly, do not place all nodes in a single rack.
•Create small clusters that consist of nodes that perform similar functions or share resources. Smaller, simple clusters are easier to design, implement, and maintain.
•For performance reasons, it may be desirable to use multiple nodes to support the same application. To provide mutual takeover services, the application must be designed in a manner that allows multiple instances of the application to run on the same node.
For example, if an application requires that the dynamic data reside in a directory called /data, chances are that the application cannot support multiple instances on the same processor. For such an application (running in a non-concurrent environment), try to partition the data so that multiple instances of the application can run—each accessing a unique database.
Furthermore, if the application supports configuration files that enable the administrator to specify that the dynamic data for instance1 of the application reside in the data1 directory, instance2 resides in the data2 directory, and so on, then multiple instances of the application are probably supported.
•In certain configurations, including additional nodes in the cluster design can increase the level of availability provided by the cluster; it also gives you more flexibility in planning node fallover and reintegration.
The most reliable cluster node configuration is to have at least one standby node.
•Choose cluster nodes that have enough I/O slots to support redundant network interface cards and disk adapters.
Ensure that you have enough nodes in your cluster, and that the nodes support redundant hardware (for example, enough I/O slots for redundant network interface cards and disk adapters). Although this adds to the cost of the cluster, we highly recommend it because it increases the availability of your application.
•Use nodes with similar processing speed.
•Use nodes with sufficient CPU cycles and I/O bandwidth to allow the production application to run at peak load. Remember, nodes should also have enough capacity to allow HACMP to operate.
To plan for this, benchmark or model your production application, and list the parameters of the heaviest expected loads. Then choose nodes for an HACMP cluster that will not exceed 85% busy when running your production application.
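The 85% guideline can be turned into a rough planning check. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that nodes have similar processing speed and that the peak loads of two workloads simply add up when one node hosts both after a fallover. Real sizing should come from your own benchmarks.

```shell
#!/bin/sh
# Sketch only: names and the additive-load model are illustrative assumptions.
BUSY_LIMIT=85   # percent, from the planning guideline above

# peak_after_fallover LOAD_A LOAD_B: combined peak utilization (percent)
# if one node has to host both workloads.
peak_after_fallover() {
    echo $(( $1 + $2 ))
}

# within_busy_limit PERCENT: succeed if PERCENT does not exceed BUSY_LIMIT.
within_busy_limit() {
    [ "$1" -le "$BUSY_LIMIT" ]
}
```

For example, within_busy_limit "$(peak_after_fallover 40 30)" succeeds, while two workloads that each peak at 50% would fail the check on a single surviving node.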
Planning for Resource Groups in an HACMP Cluster
To make your applications highly available in an HACMP cluster, plan and configure resource groups. Resource groups must include resources related to the application, such as its start and stop script (application server) and the service IP label for the application.
Plan the following for resource groups in HACMP for Linux:
•The nodelist for the resource groups must contain all or some nodes from the cluster. These are the nodes on which you “allow” HACMP to host your application. The first node in the nodelist is the default node, or the home node for the resource group that contains the application. You define the nodelist in WebSMIT.
•You can use any set of resource group policies for a resource group startup, fallover and fallback. In WebSMIT, HACMP lets you combine only valid sets of these policies and prevents you from configuring non-working scenarios.
•HACMP for Linux supports only non-concurrent resource groups.
•HACMP for Linux does not support the Fallover Using Dynamic Node Priority fallover policy.
•HACMP for Linux does not support cluster sites.
•If your applications are dependent on other applications, you may need to plan for dependencies between resource groups. HACMP lets you have node-collocated resource groups, resource groups that always must reside on different nodes, and also child resource groups that do not start before their parent resource groups are active (parent/child dependencies). Make a diagram of your dependent applications to better plan dependencies that you want to configure for resource groups, and then define them in WebSMIT.
•HACMP processes the resource groups in parallel by default.
•HACMP for Linux does not allow dynamic changes to the cluster resources or resource groups (also known as dynamic reconfiguration or DARE). This means that you must stop the cluster services, before changing the resource groups or their resources.
For complete planning information, see the guidelines in Chapter 6: Planning Resource Groups in the HACMP Cluster in the HACMP for AIX Planning Guide.

Resource Group Policies: Overview
HACMP allows you to configure only valid combinations of startup, fallover, and fallback behaviors for resource groups. The following table summarizes the basic startup, fallover, and fallback behaviors you can configure for resource groups in HACMP for Linux v. 5.4.1:
Startup Behavior: Online only on home node (first node in the nodelist)
Fallover Behavior:
•Fallover to next priority node in the list
Fallback Behavior:
•Never fall back
or
•Fall back to higher priority node in the list
Startup Behavior: Online on first available node
Fallover Behavior (any of these):
•Fallover to next priority node in the list
•Bring offline (on error node only)
Fallback Behavior:
•Never fall back
or
•Fall back to higher priority node in the list
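The combinations in the table above can be encoded as a small check, which may help when you document planned resource groups before entering them in WebSMIT. This is a sketch of the table as read here, not WebSMIT's own validation: the function name and the short policy keywords are hypothetical labels, not WebSMIT field values.

```shell
#!/bin/sh
# Sketch only: the keywords below are hypothetical labels for the
# behaviors in the table, not WebSMIT field values.
#   startup : home | first_available
#   fallover: next_priority | bring_offline
#   fallback: never | higher_priority

# valid_policy_combination STARTUP FALLOVER FALLBACK
# Succeed only for combinations listed in the table.
valid_policy_combination() {
    startup=$1 fallover=$2 fallback=$3

    # Both startup behaviors allow the same two fallback choices.
    case $fallback in
        never|higher_priority) ;;
        *) return 1 ;;
    esac

    # The allowed fallover behaviors depend on the startup behavior.
    case $startup:$fallover in
        home:next_priority) return 0 ;;
        first_available:next_priority|first_available:bring_offline) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, valid_policy_combination home bring_offline never fails, because the table lists only fallover to the next priority node for a group that starts on its home node.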
Planning IP Networks and Network Interfaces
Plan to configure the following networks and IP interfaces:
•A heartbeating IP-based network. An HACMP cluster requires at least one network that will be used for the cluster heartbeating traffic.
•A heartbeating serial network, such as RS232.
•An IP-based network that lets you connect from the application’s client machine to the nodes. The nodes serve as the application’s servers and run HACMP. To configure this network, plan to configure a client machine with a network adapter and a NIC compatible with at least one of the networks configured on the cluster nodes.
•Two HACMP cluster networks. These are TCP/IP-based networks used by HACMP for inter-node communication. HACMP utilities use them to synchronize information between the nodes and propagate cluster changes across the cluster nodes.
For each HACMP cluster network, on each cluster node plan to configure two IP labels that will be available at boot time, will be configured on different subnets, and will be used for IPAT via IP aliasing. See Planning IP Labels for IPAT via IP Aliasing.
•On the cluster node that will serve as a Web server, set up a network connection to access WebSMIT. Typically, you set up WebSMIT to be accessible from the cluster’s internal network that is not reachable from the Internet. To securely run WebSMIT on a node, you must ensure HTTP(S)/SSL connectivity to that node; it is not handled automatically by WebSMIT or HACMP. See Security Considerations.
Planning IP Labels for IPAT via IP Aliasing
IP address takeover via IP aliasing is the default method of taking over the IP address and is supported in HACMP for Linux. IPAT via IP aliasing allows one node to acquire the IP label and the IP address of another node in the cluster, using IP aliases.

To use IP Address Takeover via IP aliases in the HACMP for Linux network configuration, configure NICs for the two HACMP cluster networks so that they meet the following requirements:
•Plan to configure more than one boot-time IP label on the service network interface card on each cluster node.
•Subnet requirements:
•Multiple boot-time addresses configured on a node should be defined on different subnets.
•Service IP addresses must be configured on a different subnet from all non-service addresses (such as boot) defined for that network on the cluster node.
•Multiple service labels can coexist as aliases on a given interface.
•The netmask for all IP labels in an HACMP network must be the same.
•Manually add the IP labels described in this section into the /etc/hosts file on each node. This must be done before you proceed to configure an HACMP cluster in WebSMIT.
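The subnet requirements above can be checked before you configure the cluster. The following is a minimal sketch with hypothetical helper names and example addresses: it computes the network address of each IP address under the shared netmask and compares the results.

```shell
#!/bin/sh
# Sketch only: helper names and addresses are illustrative assumptions.

# network_of IP NETMASK: print the network address (IP AND NETMASK).
network_of() {
    old_ifs=$IFS
    IFS=.
    # Unquoted on purpose: split both dotted quads into eight octets.
    set -- $1 $2
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# different_subnets IP_A IP_B NETMASK: succeed if the two addresses
# fall on different subnets under the shared netmask.
different_subnets() {
    [ "$(network_of "$1" "$3")" != "$(network_of "$2" "$3")" ]
}
```

For example, with a netmask of 255.255.255.0, two boot-time addresses 192.168.10.5 and 192.168.20.5 pass the check, and a service address 192.168.10.9 would fail it against the first boot-time address because both are on the same subnet.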
HACMP non-service labels are defined on the nodes as the boot-time addresses, assigned by the operating system after a system boot and before the HACMP software is started. When you start the HACMP software on a node, the node’s service IP label is added as an alias onto one of the NICs that has a non-service label.
When using IPAT via IP Aliases, the node’s NIC must meet the following conditions:
•The NIC has both the boot-time and service IP addresses configured, where the service IP label is an alias placed on the interface.
•The boot-time address is never removed from a NIC; an alias is simply added on the NIC in addition to the boot-time address.
•If the node fails, a takeover node acquires the failed node’s service address as an alias on one of its non-service interfaces on the same HACMP network. During a node fallover event, the service IP label that is moved is placed as an alias on the target node’s NIC in addition to any other service labels that may already be configured on that NIC.
When using IPAT via IP Aliases, service IP labels are acquired using all available non-service interfaces. If there are multiple interfaces available to host the service IP label, the interface is chosen according to the number of IP labels currently on that interface. If multiple service IP labels are acquired and there are multiple interfaces available, the service IP labels are distributed across all the available interfaces.
Once you install HACMP for Linux, proceed to configure WebSMIT for access to the cluster configuration user interface.

Software Prerequisites for Installation
When you install HACMP for Linux, make sure that the following software is installed on the cluster nodes:
•Red Hat™ Enterprise Linux (RHEL) 4 or SUSE™ LINUX Enterprise Server (SLES) 9 (both with latest updates).
Read the WebSMIT readme file /usr/es/sbin/cluster/wsm/README for information on specific Apache V1 and V2 requirements, and for information on specific issues related to the RHEL or SUSE Linux distributions.
•RSCT 2.4.5.2. For the latest information about RSCT levels and the latest available APARs for RSCT, check the HACMP for Linux v. 5.4.1 Release Notes.
•Apache WebServer V1 and V2 (provided with the Linux distribution).
•ksh93. A compliant version of ksh. Ensure that the ksh version you have installed is ksh93 compliant. The ksh93 environment is a prerequisite for the RHEL distribution, and HACMP for Linux checks for it prior to the installation.
You can download ksh93 from the Web. The fileset name is similar to the following: ksh-20050202-1.ppc.rpm.
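Before installing, you may want a quick check that the prerequisite commands are available on each node. The sketch below uses a hypothetical helper, missing_commands, and only tests whether a command can be found on the PATH. It does not verify versions, so finding ksh this way does not prove that it is ksh93 compliant.

```shell
#!/bin/sh
# Sketch only: missing_commands is a hypothetical helper.

# Print each named command that cannot be found on the PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

For example, missing_commands rpm ksh prints nothing when both commands are found. The name of the Apache binary varies between distributions, so add whichever name your distribution uses.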

Cluster Software


The HACMP for Linux cluster software can be described in these two categories:
•Software that you need so that you can install and run the cluster. In particular, HACMP for Linux requires the RSCT (IBM Reliable Scalable Cluster Technology) subsystem to be installed on the nodes. For complete information on what software you need to install, see the installation section.
•The application that you plan to make highly available with the use of HACMP. It can be a database or another service.

Planning and Installing HACMP for Linux


This chapter describes how to plan and install HACMP for Linux. It contains the following sections:
•Cluster Hardware
•Cluster Software
•Planning the HACMP Configuration
•Installing HACMP for Linux
•Contents of the Installation Media
•Installation Process Overview
•Security Considerations
•Where You Go from Here.
Cluster Hardware
This section lists examples of IBM hardware that you can use for cluster nodes, cluster networks and cluster storage disks. For complete information, see the IBM Portal on Linux website:
http://www.ibm.com/linux/
Hardware for Cluster Nodes
HACMP for Linux lets you configure up to eight HACMP nodes. You can use:
•Selected models of IBM System p™ servers
For more information, see: http://www.ibm.com/systems/p/linux/
Also, for descriptions of IBM hardware that you can use as HACMP cluster nodes in AIX, see the HACMP for AIX Planning Guide.
Hardware for Cluster Networks
HACMP for Linux supports the following interconnection networks for clusters:
•Selected models of 10/100 Mbps Ethernet
•Selected models of Gigabit Ethernet
•Token Ring.
An Ethernet or a Token Ring network can be used as an HACMP cluster IP-based network.
Hardware for Cluster Storage
HACMP for Linux does not provide high availability for storage resources in your cluster configuration. However, you can use NFS or IBM TotalStorage disk subsystems as the storage options in your cluster.
No Automatic NFS and Volume Management
Although you can have disks and file systems configured in the same environment in which your HACMP for Linux cluster is configured, HACMP for Linux does not support NFS file systems. You cannot include file systems associated with the application into the resource groups.
This means that the file systems are not kept highly available by HACMP for Linux. In particular, during fallovers, when applications are moved to other nodes, HACMP for Linux does not automatically unmount the associated file systems on one node and mount them on the takeover node. Similarly, HACMP for Linux does not automatically perform any volume management or volume group operations for volume groups that a particular application needs to access.
However, if you want to manage storage in the cluster, you can still use NFS or GPFS to control it. To ensure that your NFS file systems work within the cluster, you must manage NFS manually, that is, completely outside of your HACMP for Linux cluster.
For example, for a two-node cluster, you can have an NFS server configured somewhere at your site, and have it export the file system to your cluster nodes. Both nodes need to mount the file system at boot time, so the file system is also mounted on the node to which the resource group may fall over in case of a failure. Your application and service IP label run on one node. On fallover, the application and service IP label move to the takeover node, where the NFS file system has been mounted since boot time. This way, your application has access to the file system regardless of which node is currently hosting the application. However, the NFS file system service provided to your application is not kept highly available by HACMP.
As an alternative, the following cluster configuration lets you have high availability of your NFS file system in the HACMP for Linux cluster. You can configure an NFS server on a separate two-node cluster, with both nodes running HACMP for AIX. Specifically, the nodes should run HACMP’s NFS component (it is part of HACMP for AIX). You can then export the file system from this highly available NFS server to the nodes of your separate HACMP for Linux cluster.
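Because HACMP for Linux does not mount file systems for you, it can be useful to confirm on every node that the NFS file system from the two-node example is mounted before the application starts. The following is a minimal sketch: the helper name is_nfs_mounted and the mount point /data are hypothetical, and the check reads a mounts listing in the format of /proc/mounts.

```shell
#!/bin/sh
# Sketch only: is_nfs_mounted is a hypothetical helper.

# is_nfs_mounted MOUNT_POINT [MOUNTS_FILE]
# Succeed if MOUNT_POINT appears in the mounts listing with a file
# system type that starts with "nfs". MOUNTS_FILE defaults to /proc/mounts.
is_nfs_mounted() {
    awk -v mp="$1" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

An application start script could call is_nfs_mounted /data and refuse to start the application when the check fails.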

Preventing Cluster Partitioning


To prevent cluster partitioning, configure a serial network for heartbeating between the nodes, in addition to the IP-based cluster network. If the IP-based cluster network connection between the nodes fails, the heartbeating network prevents data divergence and cluster partitioning.

Network Interface Failure


The HACMP software handles failures of network interfaces on which a service IP label is configured. Types of such failures are:
•Out of two network interfaces configured on the same HACMP node and network, the network interface with a service IP label fails, but an additional “backup” network interface card remains available. In this case, the Cluster Manager removes the service IP label from the failed network interface, and recovers it, via IP aliasing, on the “backup” network interface. Such a network interface failure is transparent to you except for a small delay while the system reconfigures the network interface on the node.
•Out of two network interfaces configured on a node, an additional or a “backup” network interface fails, but the network interface with a service IP label configured on it remains available. In this case, the Cluster Manager detects a (backup) network interface failure, logs the event, and sends a message to the system console. The application continues to be highly available. If you want additional processing, you can customize the processing for this event.
•If the service IP label that is part of a resource group cannot be recovered on a local node, HACMP moves the resource group with the associated IP label to another node, using IP aliasing as the mechanism to recover the associated service IP label.

How HACMP Handles Network Failures on the Local Node


A local network failure occurs when all interfaces of a specific cluster network on a node fail. For example, if you have nodes A and B, and networks net1 and net2, and all interfaces of network net1 on node A fail, then a network_down event runs for net1 with node A as the event node. You can see this event in the /tmp/hacmp.out file.
In this case, the Cluster Manager takes selective recovery actions for resource groups containing a service IP label connected to that network. The Cluster Manager attempts to recover only the resource groups affected by the local network failure event.
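To see whether a network_down event has run on a node, you can search the hacmp.out file for the event name. This is a minimal sketch with a hypothetical helper name. The exact layout of hacmp.out entries is not shown here, so the sketch only counts lines that mention the event name.

```shell
#!/bin/sh
# Sketch only: count_network_down is a hypothetical helper.

# count_network_down [LOG_FILE]
# Print the number of lines that mention a network_down event.
# LOG_FILE defaults to /tmp/hacmp.out.
count_network_down() {
    grep -c 'network_down' "${1:-/tmp/hacmp.out}"
}
```

Note that grep returns a non-zero exit status when the count is 0, so test the printed number rather than the exit status.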

Network Failure


A network failure occurs when none of the cluster nodes can access each other using any of the network interface cards configured for the HACMP network.
To protect against network failures, we recommend that you have the nodes in the cluster connected by multiple networks. If one network fails, HACMP uses a network that is still available for cluster traffic and for monitoring the status of the nodes (heartbeating).
You can also specify additional actions to process a network failure—for example, re-routing through an alternate network.

Node and Network Failure Scenarios


This section describes how HACMP for Linux handles failures and ensures that the application keeps running.
The following scenarios are considered:
•Node Failure
•Network Failure
•Network Interface Failure
•Preventing Cluster Partitioning.

Node Failure:

If the application is configured to normally run on Node1 and Node1 fails, the resource group with the application falls over, or moves, to Node2.

HACMP:Sample Configuration with a Diagram


The following configuration includes:
•Node1 and Node2 running Linux
•A serial network
•An IP-based network.

Cluster Terminology


The list below includes basic terms used in the HACMP environment.
Note:In general, terminology for HACMP is based on industry conventions for high availability. However, the meaning of some of the terms in HACMP may differ from the generic terms.
An application is a service, such as a database, or a collection of system services and their dependent resources, such as a service IP label and application’s start and stop scripts, that you want to keep highly available with the use of HACMP.
An application server is a collection of application start and stop scripts that you provide to HACMP by entering the pathnames for the scripts in the WebSMIT user interface. An application server becomes a resource associated with an application; you include it in a resource group so that HACMP keeps it highly available. HACMP ensures that the application can start and stop successfully no matter on which cluster node it is being started.
A cluster node is a physical machine, typically an AIX or a Linux server on which you install HACMP. A cluster node also hosts an application and serves as a server for the application’s clients. HACMP’s role is to ensure continuous access to the application, no matter on which node in the cluster the application is currently active.
A home node is a node on which the application is hosted, based on your default configuration for the application’s resource group, and under normal conditions.
A takeover node is a backup cluster node to which HACMP may move the application. You can move the application to this node manually, for instance, to free the home node for planned maintenance. Or, HACMP moves the application automatically, due to a cluster component failure.
In HACMP for Linux v.5.4.1, a cluster configuration includes up to eight nodes. Therefore, you can have more than one potential takeover node for a particular application. You define the list of nodes on which you want HACMP to host your application using the WebSMIT interface. This list is called a resource group’s nodelist.
A cluster IP network is used for cluster communications between the nodes and for sending heartbeating information. All IP labels configured on the same HACMP network share the netmask, but may be required to have different subnets.
An IP label is a name of a network interface card (NIC) that you provide to HACMP. Network configuration for HACMP requires planning for several types of IP labels:
•Base (or boot) IP labels on each node—the ones through which an initial cluster connectivity is established.
•Service IP labels for each application—the ones through which a connection for a highly available application is established.
•Backup IP labels (optional).
•Persistent IP labels on each node. These are node-bound IP labels that are useful to have in the cluster for administrative purposes.

Note that to ensure high availability and access to the application, HACMP “recovers” the service IP address associated with the application on another node in the cluster in cases of network interface failures. HACMP uses IP aliases for HACMP networks. For information, see Planning IP Networks and Network Interfaces.
An IP alias is an alias placed on an IP label. It coexists on an interface along with the IP label. Networks that support Gratuitous ARP cache updates enable configuration of IP aliases.
IP Address Takeover (IPAT) is a process whereby a service IP label on one node is taken over by a backup node in the cluster. HACMP uses IPAT to provide high availability of IP service labels that belong to resource groups. These labels provide access to applications. HACMP uses IPAT to recover the IP label on the same node or the backup node. HACMP for Linux by default supports the mode of IPAT known as IPAT via IP Aliasing. (The other method of IPAT—IPAT via IP Replacement is not supported).
IP Address Takeover via IP Aliasing is the default method of IPAT used in HACMP. HACMP uses IPAT via IP Aliasing in cases when it must automatically recover a service IP label on another node. To configure IPAT via IP Aliasing, you configure service IP labels and their aliases to the system. When HACMP performs IPAT during automatic cluster events, it places an IP alias recovered from the “failed” node on top of the service IP address on the takeover node. As a result, access to the application continues to be provided.
Cluster resources can include an application server and a service IP label. All or some of these resources can be associated with an application you plan to keep highly available. You include cluster resources into resource groups.
A resource group is a collection of cluster resources.
Resource group startup is an activation of a resource group and its associated resources on a specified cluster node. You choose a resource group startup policy from a predefined list in WebSMIT.
Resource group fallover is an action of a resource group, when HACMP moves it from one node to another. In other words, a resource group and its associated application fall over to another node. You choose a resource group fallover policy from a predefined list in WebSMIT.
Takeover is an automatic action during which HACMP takes over resources from one node and moves them to another node. Takeover occurs when a resource group falls over to another node. A backup node is referred to as a takeover node.
Resource group fallback is an action of a resource group, when HACMP returns it from a takeover node back to the home node. You choose a resource group fallback policy from a predefined list in WebSMIT.
Cluster Startup is the starting of HACMP cluster services on the node(s).
Cluster Shutdown is the stopping of HACMP cluster services on the node(s).
Pre- and post-events are customized scripts provided by you (or other system administrators), which you can make known to HACMP and which will be run before or after a particular cluster event. For more information on pre- and post-event scripts, see the chapter on Planning Cluster Events in the HACMP for AIX Planning Guide.

HACMP

High Availability Cluster Multi-Processing (HACMP™) on Linux is the IBM tool for building Linux-based computing platforms that include more than one server and provide high availability of applications and services.
Both HACMP for AIX and HACMP for Linux versions use a common software model and present a common user interface (WebSMIT). This chapter provides an overview of HACMP on Linux and contains the following sections:
•Overview
•Cluster Terminology
•Sample Configuration with a Diagram
•Node and Network Failure Scenarios
•Where You Go from Here.

Overview:

HACMP for Linux enables your business application and its dependent resources to continue running either at its current hosting server (node) or, in case of a failure at the hosting node, at a backup node, thus providing high availability and recovery for the application.
HACMP detects component failures and automatically transfers your application to another node with little or no interruption to the application’s end users.
HACMP for Linux takes advantage of the following software components to reduce application downtime and recovery:
•Linux operating system (RHEL or SUSE ES versions)
•TCP/IP subsystem
•High Availability Cluster Multi-Processing (HACMP™) on Linux cluster management subsystem (the Cluster Manager daemon).
HACMP for Linux: Installation and Administration Guide

HACMP for Linux provides:

•High Availability for system processes, services and applications that are running under HACMP’s control. HACMP ensures continuing service and access to applications during hardware or software outages (or both), planned or unplanned, in a cluster of up to eight nodes. Nodes may have access to the data stored on shared disks over an IP-based network (although shared disks cannot be part of the HACMP for Linux cluster and are not kept highly available by HACMP).
•Protection and recovery of applications when components fail. HACMP protects your applications against node and network failures, by providing automatic recovery of applications.
If a node fails, HACMP recovers applications on a surviving node. If a network or a network interface card (adapter) fails, HACMP uses an alternate network, an additional network interface, or an IP label alias to recover the communication links and continue providing access to the data.
•WebSMIT, a web-based user interface to configure an HACMP cluster. In WebSMIT, you can configure a basic cluster with the most widely used, default settings, or configure a customized cluster while having the access to customizable tools and functions. WebSMIT lets you view your existing cluster configuration in different ways (node-centric view, or application-centric view) and provides cluster status tools.
•Easy customization of how applications are managed by HACMP. You can configure HACMP to handle applications in the way you want:
•Applications startup.You select from a set of options for how you want HACMP to start up applications on the node(s).
•Applications recovery actions that HACMP takes. If a failure occurs with an application’s resource that is monitored by HACMP, you select whether you want HACMP to recover applications on another cluster node, or stop the applications.
•HACMP’s follow-up after recovery. You select how you want HACMP to react in cases when you have restored a failed cluster component. For instance, you decide on which node HACMP should restart the application that was previously automatically stopped (or moved to another node) due to a previously detected resource failure.
•Built-in configuration, system maintenance and troubleshooting functions. HACMP has functions to help you with your daily system management tasks, such as cluster administration, automatic cluster monitoring of the application’s health, or notification upon component failures.
•Tools for creating similar clusters from an existing “sample” cluster. You can save your existing HACMP cluster configuration in a cluster snapshot file, and later recreate it in an identical cluster in a few steps.

WebSphere Admin Console


Many thanks to Satya Dinesh Babu Manne, one of our customers, who found a new way to troubleshoot a WebSphere problem. Instead of trying to reuse existing ports, which seemed to have some conflicts, he defined new ports and transport chains. His solution is given below:

1) In the WebSphere Admin Console, navigate to Application Servers -> Server Name -> Web Container Settings -> Web Container Transport Chains.
2) In this view, which shows the current transport chains, click the New button.
3) In step 1 of the resulting wizard, give the chain a new name (I gave it WC_CacheMonitor_Inbound), select Webcontainer (Chain 1) from the template drop-down box, and click Next.

4) In step 2, give the port a new name to identify it, and enter the host and port values. For the port I gave 9030 when creating on instance 1 and 9032 when creating on instance 2. Click Next.
5) In step 3, click the Finish button.
6) Repeat the above steps for each server in the cluster (I had 4 servers).
7) Save the configuration changes.
8) Navigate to Environment -> Virtual Hosts and click the New button.
9) In the wizard, give a new name and click the OK button.
10) In the resulting window, click the newly created Virtual Host and then click Host Aliases for that Virtual Host.

11) Add the Virtual Host entries, making sure they reflect the host and port numbers (such as 9030 and 9032) that were already created in the previous steps for the Web Container Transport Chains.
12) Save the configuration changes.
13) Navigate to Applications -> Enterprise Applications -> perfServletApp -> Map virtual hosts for Web modules.
14) Select the newly created Virtual Host from the drop-down.
15) Save the configuration changes, and restart all servers.
16) The perfservlet is now accessible through ports 9030 and 9032 on the configured hosts.

I was able to configure and test a WebSphere monitor after making these changes.
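
Before pointing a monitor at the new ports, it can help to confirm that they accept TCP connections at all. The following is a minimal Python sketch and not part of the original solution: the function name port_is_open is made up for illustration, and a successful connection only shows that something is listening on the port, not that the perfservlet itself is answering.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to True/False depending on reachability."""
    return {(host, port): port_is_open(host, port, timeout) for host, port in endpoints}
```

With hypothetical host names, a call such as check_endpoints([("instance1.example.com", 9030), ("instance2.example.com", 9032)]) would be expected to report True for both entries once all servers have been restarted.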

Oracle Management


Take Control of Oracle Monitoring



Most business critical applications are database driven. The Oracle database management capability helps database administrators to seamlessly detect, diagnose and resolve Oracle performance issues and monitor Oracle 24x7. The database server monitoring tool is agentless monitoring software that provides out-of-the-box performance metrics and helps you visualize the health and availability of an Oracle Database server farm. Database administrators can log in to the web client and visualize the status and Oracle performance metrics.


Applications Manager also provides out-of-the-box reports that help analyze the database server usage, Oracle database availability and database server health.

Additionally the grouping capability helps group your databases based on the business process supported and helps the operations team to prioritize alerts as they are received.

Some of the components that are monitored in Oracle database are:

Response Time
User Activity
Status
Table Space Usage
Table Space Details
Table Space Status
SGA Performance
SGA Details
SGA Status
Performance of Data Files
Session Details
Session Waits
Buffer Gets
Disk Reads
Rollback Segment


Note: Oracle Application Server performance monitoring is also possible in Applications Manager.
Oracle Management Capabilities
Out-of-the-box management of Oracle availability and performance.
Monitors performance statistics such as user activity, status, table space, SGA performance, session details, etc. Alerts can be configured for these parameters.
Based on the thresholds configured, notifications and alerts are generated. Actions are executed automatically based on configurations.
Performance graphs and reports are available instantly. Reports can be grouped and displayed based on availability, health, and connection time.
Delivers both historical and current Oracle performance metrics, providing insight into performance over a period of time.

Database Monitoring


Database Management - Made Easy

Applications Manager is a database server monitoring tool that can help monitor a heterogeneous database server environment that may consist of Oracle, MS SQL, Sybase, IBM DB2 and MySQL databases. It also helps database administrators (DBAs) and system administrators by notifying them of potential database performance problems. For database server monitoring, Applications Manager connects to the database and ensures it is up. Applications Manager is also an agentless monitoring tool that executes database queries to collect performance statistics and sends alerts if the database performance crosses a given threshold. With out-of-the-box reports, DBAs can plan inventory requirements and troubleshoot incidents quickly.
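
The connect, query and compare-to-threshold cycle described above can be sketched in a few lines of Python. This is only an illustration of the general idea and says nothing about how Applications Manager is implemented: the function name check_database, the default threshold and the use of sqlite3 as a stand-in database are all assumptions. Some databases, Oracle among them, traditionally need a FROM clause in the probe query (for example SELECT 1 FROM dual), which is why the query is a parameter.

```python
import sqlite3
import time


def check_database(connect, max_connect_seconds=2.0, probe_query="SELECT 1"):
    """Open a connection, run a trivial query, and report status and timing.

    connect is any zero-argument callable returning a DB-API connection.
    """
    start = time.monotonic()
    try:
        conn = connect()
        try:
            cursor = conn.cursor()
            cursor.execute(probe_query)
            cursor.fetchone()
        finally:
            conn.close()
    except Exception as exc:
        # Any failure to connect or query is treated as "database down".
        return {"up": False, "seconds": None, "alert": f"database down: {exc}"}
    elapsed = time.monotonic() - start
    alert = None
    if elapsed > max_connect_seconds:
        alert = f"connection time {elapsed:.2f}s exceeds {max_connect_seconds:.2f}s"
    return {"up": True, "seconds": elapsed, "alert": alert}


# Stand-in usage with an in-memory SQLite database (an assumption for the demo).
result = check_database(lambda: sqlite3.connect(":memory:"))
print(result["up"], result["alert"])
```

Passing the connection factory as a callable keeps the check independent of the database driver, which suits the heterogeneous environment described above.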

Database Server Monitoring Software Needs to
ensure high availability of database servers
keep tabs on the database size, buffer cache size and database connection time
analyze the number of user connections to the databases at various times
analyze usage trends
help take actions proactively before critical incidents occur.
Applications Manager supports the monitoring of the following databases out-of-the-box:

Oracle Management

MySQL Management

Sybase Management

MS SQL Management

DB2 Management


Oracle Management

Oracle Monitoring includes efficient and complete monitoring of performance, availability, and usage statistics for Oracle databases. It also includes instant notifications of errors and corrective actions, and provides comprehensive reports and graphs.

MySQL Management

MySQL is the most popular open source relational database system. Applications Manager MySQL Monitoring includes managing MySQL as part of your IT infrastructure, by diagnosing performance problems in real time.

MS SQL Server Management

Microsoft SQL Server is the enterprise database solution used most commonly on Windows. Applications Manager manages MS SQL Server databases through native Windows performance management interfaces. This ensures optimal and complete access to all the metrics that MS SQL Server exposes.

DB2 Management

DB2 Monitoring includes effective monitoring of the availability and performance of DB2 databases. Applications Manager facilitates automated and on-demand monitoring tasks, which help keep DB2 databases running at their highest level of performance.

Sybase Management

The availability and performance of Sybase ASE database servers are monitored by Applications Manager. Performance metrics such as memory usage, connection statistics, etc. are monitored.

WebSphere Monitoring


Take Control of WebSphere Management


WebSphere Server is one of the leading J2EE™ application servers in today’s marketplace. Applications Manager, a tool for monitoring the performance and availability of applications and servers, helps in IBM WebSphere Management.


Applications Manager automatically diagnoses, notifies, and corrects performance and availability problems not only with WebSphere Servers, but also with the servers and applications in the entire IT infrastructure.

WebSphere monitoring involves delivering comprehensive fault management and proactive alert notifications, checking for impending problems, triggering appropriate actions, and gathering performance data for planning, analysis, and reporting.

Some of the components that can be monitored in WebSphere are:

JVM Memory Usage
Server Response Time
CPU Utilization
Metrics of all web applications
User Sessions and Details
Enterprise JavaBeans (EJBs)
Thread Pools
Java Database Connectivity (JDBC) Pools
Custom Application MBeans (JMX) attributes
WebSphere Management Capabilities
Out-of-the-box management of WebSphere availability and performance - checks if it is running and executing requests.
WebSphere Monitoring in Network Deployment mode is provided
Monitors performance statistics such as database connection pool, JVM memory usage, user sessions, etc. Alerts can be configured for these parameters.
Based on the thresholds configured, notifications and alerts are generated. Actions are executed automatically based on configurations.
Performance graphs and reports are available instantly. Grouping of reports, customized reports and graphs based on date is available.

IBM AIX Monitoring


Monitoring AIX Made Easy

Applications Manager monitors the performance of IBM AIX Systems. First, Applications Manager discovers each AIX machine and then monitors the CPU activity, complete memory utilization, and local and remote system statistics.


The AIX Management feature optimizes AIX system performance, delivers comprehensive management reports and ensures availability through automated event detection and correction. Applications Manager also monitors processes running in the AIX system.

Some of the components that are monitored in IBM AIX are:

CPU Utilization: Monitor CPU usage and check whether CPUs are running at full capacity or are being underutilized.
Memory Utilization: Avoid the problem of your system running out of memory. Get notified when memory usage is high (or free memory is dangerously low).
Disk I/O Stats: Reports reads/writes per second and transfers per second for each device.
Disk Utilization: Maintain a margin of available disk space. Get notified when the disk space falls below the margin. You can also run your own programs/scripts to clear disk clutter when thresholds are crossed.
Process Monitoring: Monitor critical processes running in your system. Get notified when a particular process fails.
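
The disk utilization rule in the list above (keep a margin of free space and raise a notification when it is crossed) can be sketched as follows. This is an illustrative Python fragment under stated assumptions: it checks a local path with the standard library rather than a remote AIX host over Telnet or SSH, and the function names and the 10 percent default margin are invented for the example.

```python
import shutil


def free_space_alert(total_bytes, free_bytes, min_free_percent=10.0):
    """Return an alert message if free space is below the margin, else None."""
    if total_bytes <= 0:
        # Pseudo-filesystems can report a size of zero; nothing to evaluate.
        return None
    free_percent = 100.0 * free_bytes / total_bytes
    if free_percent < min_free_percent:
        return f"only {free_percent:.1f}% free (margin {min_free_percent:.1f}%)"
    return None


def check_path(path, min_free_percent=10.0):
    """Evaluate the filesystem that holds path against the free-space margin."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return free_space_alert(usage.total, usage.free, min_free_percent)
```

A scheduler could call check_path for each mount point of interest and, when a message comes back, send the notification or start a cleanup script, in line with the item on running your own programs when thresholds are crossed.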

IBM AIX Monitoring Capabilities
Out-of-the-box management of IBM AIX availability and performance.
Monitors performance statistics such as CPU utilization, memory utilization, disk utilization, Disk I/O Stats and response time.
Mode of monitoring includes Telnet and SSH.
Monitors processes running in AIX systems.
Based on the thresholds configured, notifications and alerts are generated if the AIX system or any specified attribute within the system has problems. Actions are executed automatically based on configurations.
Performance graphs and reports are available instantly. Reports can be grouped and displayed based on availability, health, and connection time.
Delivers both historical and current AIX performance metrics, providing insight into performance over a period of time.
Monitors memory usage and detects top consumers of memory.
For more information, refer to IBM AIX Monitoring Online Help.

WPAR and LPAR comparison


IBM has taken a leadership role in innovation over the past fourteen years and
has been number one in the patent technology race. Out of this has come a
plethora of new and innovative products. In 2001 IBM announced the LPAR
feature on IBM eServer pSeries and then in 2004 Advanced Power Virtualization
provided the micropartitioning feature. In 2007, IBM announced WPAR Mobility.
WPARs are not a replacement for LPARs. These two technologies are both key
components of IBM's virtualization strategy. The two technologies are
complementary, and can be used together to extend their individual values.
Providing both LPAR and WPAR technology offers a broad range of virtualization
choices to meet the ever changing needs in the IT world. Table 2-2 compares
and contrasts the benefits of the two technologies.
Table 2-2 Comparing WPAR and LPAR
Workload Partitions | Logical Partitions
WPARs share OS images | LPARs execute OS images
Finer-grained resource management, per-workload | Resource management per LPAR; capacity on demand
Security isolation | Stronger security isolation
Easily shared files and applications | Supports multiple OSes; tunable to applications
Lower administrative costs: 1 OS to manage; easy create/destroy/configure; integrated management tools | OS fault isolation
Chapter 2. Understanding and Planning for WPARs 37
Draft Document for Review August 6, 2007 12:52 pm 7431CH_TECHPLANNING.fm
Figure 2-7 shows how LPAR and WPARs can be combined within the same
physical server, which also hosts the WPAR Manager and NFS server required to
support partition mobility.
Important: When considering the information in Table 2-2 you should keep in
mind the following guidelines:
•In general, when compared to WPARs, LPARs will provide a greater
amount of flexibility in supporting your system virtualization strategies.
Once you have designed an optimal LPAR resourcing strategy, then within
that strategy you design your WPAR strategy to further optimize your
overall system virtualization strategy in support of AIX6 applications. See
Figure 2-7 for an example of this strategy, where multiple LPARs are
defined to support different OS and application hosting requirements, while
a subset of those LPARs running AIX6 are set up specifically to provide a
global environment for hosting WPARs.
•Because LPAR provisioning is hardware/firmware based, you should
consider LPARs as a more secure starting point for meeting system
isolation requirements than WPARs.
7431CH_TECHPLANNING.fm Draft Document for Review August 6, 2007 12:52 pm
38 Workload Partitions in IBM AIX Version 6.1

WPAR mobility.


Live Application Mobility is the newest virtualization technology from IBM. This is
a software approach that enhances the current line of technology. Live
Application Mobility is a complement to IBM’s line of virtualization packages. The
premise is to allow for planned migrations of workloads from one system to
another without interrupting the application. This could be used, for example,
to perform a planned firmware installation on the server. Most workloads do not
need to be aware of the WPAR relocation. But proper planning and testing are
always recommended before moving anything into a production environment.
WPAR mobility, also referred to as relocation, applies to both types of WPAR:
application and system. The relocation of a WPAR consists of moving its
executable code from one LPAR to another one, while keeping the application
data on the same storage devices. It is therefore mandatory that these storage
devices are accessible from both the source and target LPARs hosting the
WPAR.
In the initial version of AIX 6, this dual access to the storage area is provided
thanks to NFS. As mentioned previously, the hosting global environment hides
the physical and logical device implementations from the hosted WPARs. The
WPAR only deals with data storage at filesystem level. All files that need to be
written by the application must be hosted on an NFS filesystem. All other files,
including the AIX operating system files, can be stored in filesystems local to the
hosting global environment. Table 2-1 helps plan the creation of the
filesystems for an application that requires WPAR mobility, whether hosted in an
application or system workload partition, assuming the application only writes in
filesystems dedicated to the application.
Table 2-1 Default filesystem location to enable partition mobility
Filesystem            Application WPAR     System WPAR
/                     Global environment   NFS mounted
/tmp                  Global environment   NFS mounted
/home                 Global environment   NFS mounted
/var                  Global environment   NFS mounted
/usr                  Global environment   Global environment
/opt                  Global environment   Global environment
application specific  NFS mounted          NFS mounted
Figure 2-5 on page 33 shows an example of a complete environment in which to
deploy LPARs and WPARs on two p595 systems.
The first global environment, called saturn, is hosted in an LPAR of the first
p595. It is a client of the NFS server, as is titian, the system WPAR inside of
it. The second system is also a p595, but could be any of the same class of
systems from the p505 on up. One of its LPARs hosts a global environment
called jupiter, which is also a client of the NFS server.
There is also a utility server, which in this example is a p550. On this system there is an
NFS server, a NIM server and a WPAR Manager for AIX to provide the single
management point needed for all the WPARs. The NIM server is in the picture to
represent how to load AIX images into the frame, which could have a large
number of LPARs. The NFS server provides an outside-the-box filesystem
solution to the WPARs and provides the vehicle to move them on the fly from one
system to another without disrupting the application.
Figure 2-5 Overview of the topology requirements in a mobile WPAR solution
The NFS server is a standard configuration and uses either NFS protocol
version 3 or version 4. Either command-line editing or SMIT can be used to
configure /etc/exports.
Figure 2-6 is a representation of the relationship between the different views of
the same filesystems as seen:
from the NFS server where they are physically located,
from the global environments on which they are NFS-mounted, and
from the system WPAR that uses them.
In the WPAR, /opt, /proc and /usr are set up as namefs mounts with read-only
permissions (exception: /proc is always read-write), mapping onto the global
environment's /opt, /proc and /usr. The rest of the filesystems (/, /home, /tmp and
/var) are set up as standard NFS mounts. The /etc/exports file on the NFS server must
have permissions set for both the global environment (jupiter) and the system WPAR
(ganymede) for mobility to work.
Important: The NFS server must provide access to both the global
environment and the WPAR in order for the WPAR to work at all. In a mobility
scenario, access must be provided to the WPAR and all global environments
to which the WPAR may be moved. Furthermore, any time /, /var, /usr, or /opt
are configured as NFS mounts, the NFS server must provide root access (e.g.
via the -r option to mknfsexp) to all of the relevant hostnames.
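
The access rule in the note above lends itself to a simple consistency check. The Python sketch below is a hypothetical model only: it does not read or write the real /etc/exports syntax, and the dictionary layout and the function name missing_access are assumptions. It encodes two statements from the text: every NFS-mounted directory must be accessible to the WPAR and to each global environment it may move to, and /, /var, /usr and /opt additionally need root access.

```python
# Mount points that need root access when they are NFS mounts (from the note above).
ROOT_ACCESS_MOUNTS = {"/", "/var", "/usr", "/opt"}


def missing_access(exports, nfs_mounts, wpar_host, global_hosts):
    """List the access grants that are still missing on the NFS server.

    exports:    {server_dir: {"access": set_of_hosts, "root": set_of_hosts}}
    nfs_mounts: {mount_point_inside_wpar: server_dir}
    """
    required = {wpar_host, *global_hosts}
    problems = []
    for mount_point, server_dir in sorted(nfs_mounts.items()):
        entry = exports.get(server_dir, {"access": set(), "root": set()})
        for host in sorted(required - entry["access"]):
            problems.append(f"{server_dir}: no access for {host}")
        if mount_point in ROOT_ACCESS_MOUNTS:
            for host in sorted(required - entry["root"]):
                problems.append(f"{server_dir}: no root access for {host}")
    return problems


# Directories as they appear on the gracyfarms server for the ganymede WPAR.
NFS_MOUNTS = {
    "/": "/big/ganymede/root",
    "/home": "/big/ganymede/home",
    "/tmp": "/big/ganymede/tmp",
    "/var": "/big/ganymede/var",
}
HOSTS = {"ganymede", "jupiter"}
EXPORTS = {d: {"access": set(HOSTS), "root": set(HOSTS)} for d in NFS_MOUNTS.values()}

# Nothing is missing while ganymede only ever runs on jupiter.
print(missing_access(EXPORTS, NFS_MOUNTS, "ganymede", ["jupiter"]))
```

Adding saturn to the list of possible target global environments makes the same call report the grants that would have to be added on the server before a relocation to saturn could work.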
Figure 2-6 Filesystems from the NFS for a Mobile System WPAR
Using the df command as in Example 2-5 shows that the global environment jupiter has its
own filesystems hosted on locally attached disks as well as NFS filesystems
mounted from the gracyfarms NFS server, for use by the ganymede system
WPAR.
Example 2-5 NFS server mountpoints for ganymede WPAR
root: jupiter:/wpars/ganymede --> df
Filesystem 512-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 131072 66376 50% 1858 6% /
/dev/hd2 3801088 646624 83% 32033 7% /usr
/dev/hd9var 524288 155432 71% 4933 8% /var
/dev/hd3 917504 233904 75% 476 1% /tmp
/dev/hd1 2621440 2145648 19% 263 1% /home
/proc - - - - - /proc
/dev/hd10opt 1572864 254888 84% 7510 4% /opt
gracyfarms:/big/ganymede/root 131072 81528 38% 1631 16% /wpars/ganymede
gracyfarms:/big/ganymede/home 131072 128312 3% 5 1% /wpars/ganymede/home
/opt 1572864 254888 84% 7510 4% /wpars/ganymede/opt
/proc - - - - - /wpars/ganymede/proc
gracyfarms:/big/ganymede/tmp 262144 256832 3% 12 1% /wpars/ganymede/tmp
/usr 3801088 646624 83% 32033 7% /wpars/ganymede/usr
gracyfarms:/big/ganymede/var 262144 229496 13% 1216 5% /wpars/ganymede/var

Application WPARs


There are two different types of workload partitions. The simpler is the application
WPAR. It can be viewed as a shell which spawns an application and can be
launched from the global environment. This is a lightweight application resource:
it does not provide remote login capabilities for end users. It contains only a small
number of processes, all related to the application, and uses the services of the
global environment daemons and processes.
It shares the operating system filesystems with the global environment. It can be
set up to receive its application filesystem resources from disks owned by the
hosting AIX instance, or from an NFS server.
Figure 2-3 on page 28 shows the relationship of an application WPAR's
filesystems to the default global environment filesystems. The filesystems that
are visible to processes executing within the application WPAR are the global
environment filesystems shown by the relationships in the figure.
If an application WPAR accesses data on an NFS mounted filesystem, this
filesystem must be mounted in the global environment directory tree. The mount
point is the same whether viewed from within the WPAR or from
the global environment. The system administrator of the NFS server must
configure the /etc/exports file so that filesystems are exported to both the global
environment IP address and to the application WPAR IP address.
Processes executing within an application WPAR can only see processes that are
executing within the same WPAR. In other words, the use of Inter Process
Communication (IPC) by application software is limited to the set of processes
within the boundary of the WPAR.
Application WPARs are temporary objects. The life-span of an application
WPAR is the life-span of the application it hosts. An application WPAR is created
at the time the application process is instantiated. The application WPAR is
destroyed when the last process running within the application partition exits. An
application WPAR is a candidate for mobility. It can be started in one LPAR and
relocated to other LPARs during the life of its hosted application process.
Figure 2-3 File system relationships from the global environment to the Application WPAR
2.5 System WPARs
The second type of WPAR is a system WPAR. A system WPAR provides a typical
AIX environment for executing applications, with some restrictions. A system
WPAR has its own runtime resources. It contains an init process that can spawn
daemons. For example, it has its own inetd daemon to provide networking
services, and its own System Resource Controller (SRC).
Every system WPAR has its own unique set of users, groups and network
interface addresses. The users and groups defined within a system WPAR are
completely independent from the users and groups defined at the global
environment level. In particular, the root user of the WPAR only has superuser
privileges within this WPAR, and has no privilege in the global environment (in
fact, root and other users defined within the WPAR cannot even access the
global environment). In the case of a system partition hosting a database server,
the DB administrator can, for example, be given root privilege within the DB
WPAR without being given any global environment privilege.
The environment provided by a system WPAR to its hosted application and
processes is a chrooted, complete AIX environment, with access to all AIX system
files that are available in a native AIX environment. The creation of a system
WPAR includes the creation of a base directory, referred to as the base directory
in the WPAR documentation. This base directory is the root of the chroot system
WPAR environment. By default, the path to this base directory is
/wpars/ in the global environment.
By default, the base directory contains 7 filesystems:
•/, /home, /tmp and /var are real filesystems, dedicated to the system partition's
use.
•/opt and /usr are read-only namefs mounts over the global environment’s /opt
and /usr.
•The /proc pseudo-filesystem maps to the global environment /proc
pseudo-filesystem (/proc in a WPAR only makes available process
information for that WPAR).
Figure 2-4 depicts an overview of these filesystems, viewed from the global
environment and from within the system WPAR. In this example, a WPAR called
titian is hosted in an LPAR called saturn. Although the diagram shows the global
environment utilizing VIOs with two vscsi adapters, along with virtual disks, and
using AIX native MPIO for a highly available rootvg, the system could also be set up
and supported with physical adapters and disks.
Figure 2-4 Filesystems relationship from the Global Environment to the System WPAR
In this figure, boxes with a white background symbolize real filesystems, while boxes
with orange backgrounds symbolize links. The gray box labeled titian shows the
pathnames of the filesystems as they appear to processes executing within the
system WPAR. The gray box labeled saturn shows the pathnames of the
filesystems used within the global environment, as well as the basedir mount
point below which the system WPARs are created.
Example 2-1 shows the /wpars directory created within a global environment to host the base
directories of the WPARs created within that global environment.
Example 2-1 Listing files in the global environment
root: saturn:/ --> ls -ald /wpars
drwx------ 5 root system 256 May 15 14:40 /wpars
root: saturn:/ -->
Looking inside /wpars, there is now a titian directory,
as shown in Example 2-2.
Example 2-2 Listing /wpars in the global environment
root: saturn:/wpars --> ls -al /wpars
drwx------ 3 root system 512 May 1 16:36 .
drwxr-xr-x 23 root system 1024 May 3 18:06 ..
drwxr-xr-x 17 root system 4096 May 3 18:01 titian
In Example 2-3, we see the mount points for the filesystem of the operating
system of titian as created from saturn to generate this system WPAR.
Example 2-3 Listing the contents of /wpars/titian in the global environment
root: epp182:/wpars/titian --> ls -al /wpars/titian
drwxr-xr-x 17 root system 4096 May 3 18:01 .
drwx------ 3 root system 512 May 1 16:36 ..
-rw------- 1 root system 654 May 3 18:18 .sh_history
drwxr-x--- 2 root audit 256 Mar 28 17:52 audit
lrwxrwxrwx 1 bin bin 8 Apr 30 21:20 bin -> /usr/bin
drwxrwxr-x 5 root system 4096 May 3 16:41 dev
drwxr-xr-x 28 root system 8192 May 2 23:26 etc
drwxr-xr-x 4 bin bin 256 Apr 30 21:20 home
lrwxrwxrwx 1 bin bin 8 Apr 30 21:20 lib -> /usr/lib
drwx------ 2 root system 256 Apr 30 21:20 lost+found
drwxr-xr-x 142 bin bin 8192 Apr 30 21:23 lpp
drwxr-xr-x 2 bin bin 256 Mar 28 17:52 mnt
drwxr-xr-x 14 root system 512 Apr 10 20:22 opt
dr-xr-xr-x 1 root system 0 May 7 14:46 proc
drwxr-xr-x 3 bin bin 256 Mar 28 17:52 sbin
drwxrwxr-x 2 root system 256 Apr 30 21:22 tftpboot
drwxrwxrwt 3 bin bin 4096 May 7 14:30 tmp
lrwxrwxrwx 1 bin bin 5 Apr 30 21:20 u -> /home
lrwxrwxrwx 1 root system 21 May 2 23:26 unix -> /usr/lib/boot/unix_64
drwxr-xr-x 43 bin bin 1024 Apr 27 14:31 usr
drwxr-xr-x 24 bin bin 4096 Apr 30 21:24 var
drwxr-xr-x 2 root system 256 Apr 30 21:20 wpars
Example 2-4 shows the output of the df executed from the saturn global
environment. It shows that one system WPAR is hosted within saturn, with its
filesystems mounted under the /wpars/titian base directory. The example shows
that the /, /home, /tmp and /var filesystems of the system WPAR are created on
logical volumes of the global environment. It also shows that the /opt and /usr
filesystems of the WPAR are namefs mounts over the global environment /opt
and /usr.
Example 2-4 Listing mounted filesystem in the global environment
root: saturn:/wpars/titan --> df
Filesystem 512-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 131072 66376 50% 1858 6% /
/dev/hd2 3801088 646624 83% 32033 7% /usr
/dev/hd9var 524288 155432 71% 4933 8% /var
/dev/hd3 917504 233904 75% 476 1% /tmp
/dev/hd1 2621440 2145648 19% 263 1% /home
/proc - - - - - /proc
/dev/hd10opt 1572864 254888 84% 7510 4% /opt
glear.austin.ibm.com:/demofs/sfs 2097152 1489272 29% 551 1% /sfs
/dev/fslv00 131072 81528 38% 1631 16% /wpars/titian
/dev/fslv01 131072 128312 3% 5 1% /wpars/titian/home
/opt 1572864 254888 84% 7510 4% /wpars/titian/opt
/proc - - - - - /wpars/titian/proc
/dev/fslv02 262144 256832 3% 12 1% /wpars/titian/tmp
/usr 3801088 646624 83% 32033 7% /wpars/titian/usr
/dev/fslv03 262144 229496 13% 1216 5% /wpars/titian/var
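
Reading df listings like the one above by eye gets tedious once several WPARs are hosted, so the classification can be sketched in Python. This is a heuristic written for this example, not an AIX tool: the function name classify_mounts is hypothetical, and the rules (a host:path source means NFS, a /dev/ source means a local logical volume, any other directory source is treated as a namefs mount) are assumptions based on how the sources appear in the listings in this section.

```python
def classify_mounts(df_output, base_dir):
    """Map each mount point at or below base_dir to nfs, local, namefs or proc."""
    kinds = {}
    for line in df_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row ("Filesystem 512-blocks ...").
        if len(fields) < 7 or fields[0] == "Filesystem":
            continue
        source, mount_point = fields[0], fields[-1]
        if mount_point != base_dir and not mount_point.startswith(base_dir + "/"):
            continue  # belongs to the global environment itself
        if source == "/proc":
            kinds[mount_point] = "proc"
        elif ":" in source:
            kinds[mount_point] = "nfs"      # server:/exported/directory
        elif source.startswith("/dev/"):
            kinds[mount_point] = "local"    # logical volume of the global environment
        else:
            kinds[mount_point] = "namefs"   # directory mounted over a directory
    return kinds


# A few lines copied from Example 2-4.
SAMPLE = """
Filesystem 512-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 131072 66376 50% 1858 6% /
/dev/fslv00 131072 81528 38% 1631 16% /wpars/titian
/opt 1572864 254888 84% 7510 4% /wpars/titian/opt
/proc - - - - - /wpars/titian/proc
/usr 3801088 646624 83% 32033 7% /wpars/titian/usr
/dev/fslv03 262144 229496 13% 1216 5% /wpars/titian/var
"""

for mount_point, kind in sorted(classify_mounts(SAMPLE, "/wpars/titian").items()):
    print(mount_point, kind)
```

Run against the Example 2-5 listing with /wpars/ganymede as the base directory, the same function would label the gracyfarms mounts as nfs, which is a quick way to see whether a system WPAR is laid out for mobility.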