IBM AIX Blog: 2008

Sunday, October 5, 2008

Vmstat output

vmstat reports virtual memory statistics and gives a summary of overall system usage: processes, memory, paging, block IO, traps, and CPU activity.
The first report produced gives averages since the last reboot. Additional reports give information on a sampling period of length delay. The process and memory reports are instantaneous in either case.
Example: to see usage averaged over 5-second intervals, but display only 8 lines:
# vmstat 5 8
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0 207904   5760  20524   0   0    86    30  117    47   1   2  96
 0  0  0      0 207904   5776  20524   0   0     0     5  103    12   0   0 100
 0  0  0      0 207904   5780  20524   0   0     0     1  108    26   0   0 100
 0  0  0      0 207904   5780  20524   0   0     0     1  106    19   0   0 100
 0  0  0      0 207904   5792  20524   0   0     0     5  112    33   0   0 100
 0  0  0      0 207904   5796  20524   0   0     0     7  108    19   0   0 100
 0  0  0      0 207904   5808  20524   0   0     0     4  108    24   0   0 100
 0  0  0      0 207904   5808  20524   0   0     0     1  107    22   0   0 100
The table shows six categories of information on the first line; the major fields of each are described below.
FIELD DESCRIPTIONS
Procs - The number of processes and their types
r: The number of processes waiting for run time.
b: The number of processes in uninterruptible sleep, which means they are waiting on a resource.
w: The number of processes swapped out but otherwise ready to run.
Memory - Info about physical memory and swap space
swpd: The amount of virtual memory used (kB).
free: The amount of idle [free] physical memory (kB).
buff: The amount of memory used as buffers (kB).
cache: The amount of memory used as cache (kB).
Swap - Amount of swapping
si: Amount of memory swapped in from disk (kB/s).
so: Amount of memory swapped out to disk (kB/s).
Note: Sustained high numbers here indicate too much swapping.
IO - Info about input and output
bi: Blocks received from a block device (blocks/s).
bo: Blocks sent to a block device (blocks/s).
Note: Sustained high numbers here indicate too much disk activity.
System - Information about the system
in: The number of interrupts per second, including the clock.
cs: The number of context switches per second, i.e. the number of times the kernel changes which process is running.
CPU - These are percentages of total CPU time
us: % of time used by user processes (user time).
sy: % of time used by system processes (system time).
id: % of time the CPU was idle (idle time).
All Linux blocks are currently 1k, except for CD-ROM blocks which are 2k.
See: /proc/meminfo /proc/stat
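
As a rough sketch of how the cpu columns can be used in a script, the helper below pulls the idle percentage out of one vmstat data line. It assumes the column layout shown in this post, where id is the last field (newer vmstat versions append more columns). The function names and the 20% threshold are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of vmstat itself.

# Print the idle percentage from one vmstat data line.
# Assumes "id" is the last column, as in the sample output in this post.
vmstat_idle() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# Succeeds when idle time is below 20% (an arbitrary example threshold).
cpu_is_busy() {
    idle=$(vmstat_idle "$1")
    [ "$idle" -lt 20 ]
}

line="0 0 0 0 207904 5760 20524 0 0 86 30 117 47 1 2 96"
echo "idle: $(vmstat_idle "$line")%"
if cpu_is_busy "$line"; then
    echo "CPU busy"
else
    echo "CPU mostly idle"
fi
```

In practice the line would come from something like the last line of vmstat 5 2, since the first report only shows averages since boot.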

Important Notes while working on AIX

Note1- To Check Serial port connectivity (for HACMP), connect serial cable between two systems:-

On one server run
# cat < /dev/tty1

On second server run
# ls > /dev/tty1

Note2- Tips to keep in mind when calculating file system sizes in blocks (512-byte or 1024-byte):-

Before calculating the value for increasing or decreasing a file system size, first check whether the size is expressed in 512-byte blocks or in 1024-byte (1 KB) blocks.

Calculations-
Increase file system size by a number of MB-
Formula- (MB to increase * 1024 * 2) + current size in blocks = new total file system size in 512-byte blocks.
(Note- Multiply by two only when the existing file system size is in 512-byte blocks; do not multiply when it is already in 1024-byte blocks.)

Increase file system size by a number of GB-
Formula- (GB to increase * 1024 * 1024 * 2) + current size in blocks = new total file system size in 512-byte blocks.
(Note- Multiply by two only when the existing file system size is in 512-byte blocks; do not multiply when it is already in 1024-byte blocks.)
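
The arithmetic can be sketched as a few shell functions. This is only an illustration, assuming sizes are counted in 512-byte blocks; the function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for the block arithmetic in Note2.
# 1 MB = 1024 KB, and each KB is two 512-byte blocks.

mb_to_512_blocks() {
    echo $(( $1 * 1024 * 2 ))
}

gb_to_512_blocks() {
    echo $(( $1 * 1024 * 1024 * 2 ))
}

# new_fs_size CURRENT_BLOCKS MB_TO_ADD
# Prints the new total size in 512-byte blocks.
new_fs_size() {
    echo $(( $1 + $(mb_to_512_blocks "$2") ))
}

mb_to_512_blocks 128        # 262144
gb_to_512_blocks 1          # 2097152
new_fs_size 262144 128      # 524288
```

The resulting number is the kind of value that would be passed as size= when a size is given in 512-byte blocks.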

Note3- When restoring a mksysb on another machine, we can change the attributes in the bosinst.data file that is backed up with the mksysb. For example- #vi bosinst.data

Note4- To identify the type of system hardware capability you have, either 32-bit or 64-bit, execute the bootinfo -y command. If the command returns a 32, you cannot use the 64-bit kernel.

Note5- The AIX 5L operating system previously contained both a uniprocessor and a multiprocessor 32-bit kernel. Effective with AIX 5L Version 5.3, the operating system supports only the multiprocessor kernel, regardless of the number of physical processors.

Note6- Types of VG’s and limitations.

Note7- To run a command by default during system boot, add the command to the /etc/rc file.
For example- We can add the commands /usr/bin/quotacheck -a and /usr/bin/quotaon -a to the rc file.

Note8- The default signal sent by the kill command is the terminate signal, SIGTERM (signal no. 15). The signal names are listed in /usr/include/sys/signal.h.

Note9- Most common SIGNALS used by kill command are-
15- SIGTERM (Terminate) (Default)
9- SIGKILL (KILL)
18- SIGTSTP (STOP)

Note10- The svmon command displays the current state of virtual memory in nine different reports-
1. global
2. user
3. command
4. class
5. tier
6. process
7. segment
8. detailed segment
9. frame

The flags and detailed information can be found at web site- http://publib.boulder.ibm.com/infocenter/pseries/index.jsp.

Note11- The startsrc, stopsrc and refresh commands send a request to the SRC to start, stop or refresh a subsystem, a group of subsystems or a subserver.

Note12- Zombie processes display as <defunct> when listed by the ps command.

Note13- Pressing CTRL-C while a command is running cancels the whole process.

Note14- Pressing CTRL-Z while a command is running stops (suspends) the process immediately.

Note15- The wildcard characters are- asterisk (*) and question mark (?).
Where, The metacharacters are- open and close square brackets ([ ]), hyphen (-), and exclamation mark (!).

Note16- When using SMIT menus for configuration, the field symbols mean-
* - A mandatory field that you have to fill in while using SMIT.
# - Numeric parameter.
+ - A list of options is available and can be displayed as a pop-up list.
/ - Full path is required.

Note17- In AIX 5L the AIX print subsystem is already configured. To enable the System V printing subsystem in AIX 5L, you have to install the packages from the AIX base CD.

Note18- The smitty installp command stores information about maintenance, removal and installation of packages in /var/adm/sw/installp.log, and some detailed information in $HOME/smit.log and $HOME/smit.script.

Note19- By default, when the instfix command is run from the command line, the command uses stdout and stderr for reporting. If you want to generate an installation report, you will need to redirect the output.
For example:
#instfix -aik IY73748 > /tmp/instfix.out 2> /tmp/instfix.err

Note20- Types of AIX Installations and difference between them-
New and complete overwrite
Preservation and
Migration

The difference is-

Note21- Default IPs for the HMC ports on the server are-
HMC Port1- 192.168.2.147
HMC Port2- 192.168.3.147
These IPs are the defaults for a new p-series server until changed.

Note22- Authentication for server HMC port is-
User – admin
Password- password
This is default until changed.

Note23- Default IP for HMC’s Ethernet port is-
eth0- 192.168.3.143

Note24- User name and password for HMC system login are-
User- hmcroot
Pass- abc123
This is the default shipped with the system until changed.

Note25- Default IP for the IBM SAN switch management port is-
Management port- 10.77.77.77
This is the default shipped with all switches until changed.

Note26- Default IPs for the SAN storage (DS4300) management ports are-
Controller A management port- 192.168.128.101
Controller B management port- 192.168.128.102
Subnet mask- 255.255.255.0
This is the default shipped with the DS4300 storage until changed.

Note27- System booting modes are-
Normal mode
System management services (SMS)
Maintenance mode
Diagnostics

Note28- 32767 Users can connect with AIX server at single time.

Note29- The first 512 bytes of the hdd are reserved for the VGDA and quorum.

Note30- CAPP EL4 is for SSL in AIX and TCB – Trusted computing base, it is for security reasons, we can restore some important files with tcbck commands.

Note31- The crontab command uses the following format-

minute    hour      day-of-month   month     weekday                 command
0 to 59   0 to 23   1 to 31        1 to 12   0 to 6 (0 for Sunday)   "command"

An asterisk (*) can be used as a wildcard in any field to mean every value.

Note32- File /etc/environment is to set the basic environments for the system.

Note33- The system uses the following file sequence when a user logs in to the system:-

/etc/motd (Global, for all users)
/etc/profile (Global environment for all users)

$HOME/.profile (Per-user environment settings)

$HOME/.hushlogin (If this file exists, the message from motd is hidden)

Note33- To clear the wall and console messages, use “esc+ctrl+l” key.

Note34- In HACMP, minimum nodes capacity is 2 and maximum is 32.

Note35- Four different types of hardware platform (Architectures) are-

RS6k: RS6000 (MCA-based uni-processor models)
RS6kSMP: RS6000 SMP (MCA-based symmetric multiprocessor models)
RSPC: ISA-bus models
CHRP: Common hardware reference platform (PCI-bus models)

Note36- Format for the date command is:-

mmddHHMMccyy, where mm-Month, dd-day, HH-Hour, MM-Minutes, and ccyy is for century and year.
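
A small sketch of assembling a string in this format with printf. The function name is made up for illustration, and the arguments are assumed to be plain integers without leading zeros (a value such as 08 would be read as invalid octal by printf).

```shell
#!/bin/sh
# Hypothetical helper: build an mmddHHMMccyy string.
# Arguments: month day hour minute four-digit-year
date_string() {
    printf '%02d%02d%02d%02d%04d\n' "$1" "$2" "$3" "$4" "$5"
}

# 5 October 2008, 14:30
date_string 10 5 14 30 2008     # prints 100514302008
```

The printed string is what would be handed to the date command as root; the sketch only formats it and does not set the clock.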

Note37- Logical track group (LTG) size is the maximum allowed transfer size for a disk I/O operation.

Note38- While working with errpt commands, these things are required to keep in mind-

Classes: General source of the error, the possible error classes are:

H Hardware.
S Software.
O Informational messages.
U Undetermined.

Type: Severity of the error that has occurred. The following types of errors are possible-
PEND The loss of availability of a device or component is imminent.

PERF The performance of the device or component has degraded to below an acceptable level.

PERM A condition that could not be recovered from. Error types with this value are usually the most severe errors and are more likely to mean that you have a defective hardware device or software module. Error types other than PERM usually do not indicate a defect, but they are recorded so that they can be analyzed by the diagnostics
programs.

TEMP A condition that was recovered from after a number of unsuccessful attempts. This error type is also used to record informational entries, such as data transfer statistics for DASD devices.

UNKN It is not possible to determine the severity of the error.

INFO The error log entry is informational and was not the result of an error.

Note39- A backup of rootvg or a user VG includes only the file systems that are mounted; unmounted file systems and raw devices are not included in the VG backup.

Note40- Splitting a VG means dividing a mirrored VG into two VGs. We can give the new VG name in the splitvg command. The PV of the split-off VG shows as a snapshot PV. To rejoin the VG, use the command joinvg VGNAME.

Note41- Types of devices in UNIX are-

Block device: Block device is a structured random access device. Buffering is used to provide a block-at-a-time method of access. Usually only disk file systems.

Character (raw) device: Character (raw) device is a sequential, stream-oriented device which provides no buffering.

Tips- Most block devices also have an equivalent character device. For example, /dev/hd1 provides buffered access to a logical volume whereas /dev/rhd1 provides raw access to the same logical volume.

Tips- To tell block and character devices apart, run #ls -l /dev. The first character of each entry is b for a block device and c for a character device.

Some of the commonly used block and character devices in system are-

Examples of block devices:
cd0 CD-ROM
fd0, fd0l, fd0h Diskette
hd1, lv00 Logical Volume
hdisk0 Physical Volume

Examples of character (raw) devices:
console, lft, tty0 Terminal
lp0 Printer
rmt0 Tape Drive
tok0, ent0 Adapter
kmem, mem, null Memory
rfd0, rfd0l, rfd0h Diskette
rhd1, rlv00 Logical Volume
rhdisk0 Physical Volume

Major and minor numbers: Major number refers to the software section of code in the kernel which handles that type of device, and the minor number to the particular device of that type.

Note42-
SRC: The System resource controller provides a set of commands to make it easier for the administrator to control subsystems.

Subsystem, Subserver and group of Subsystems: A subsystem is a program (or a set of related programs) designed to perform a function. This can be further divided into subservers. Some subsystem have subservers. Subservers are similar to daemons. SRC was designed to minimize the need for user intervention since it provides control of individual subsystem or groups of subsystems with a few commands.

Example: The tcp/ip group contains a subsystem, inetd, that has several subservers, for example ftp and telnet.

Note43-
VGDA: The Volume Group Descriptor Area (VGDA) is an area of disk, at least one per PV, containing information for the entire VG. It contains administrative information about the volume group (for example, a list of all logical volume entries, a list of all the physical volume entries and so forth). There is usually one VGDA per physical volume. The exceptions are volume groups of only one or two disks.
In these cases, if the VG contains only one hdd, there are two VGDAs on it; if the VG contains two hdds, there are three VGDAs in total, two VGDAs on one disk and one VGDA on the second disk.

Quorum: There must be a quorum (quorum meaning in dictionary is - minimum number of members that must be present to constitute a valid meeting) of VGDAs available to activate the volume group and make it available for use (varyonvg). A quorum of VGDA copies is needed to ensure the data integrity of management data that describes the logical and physical volumes in the volume group. A quorum is equal to 51% or more of the VGDAs available.

Tips: A system administrator can force a volume group to varyon without a quorum. This is not recommended and should only be done in an emergency.
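
The VGDA counts and the quorum rule can be sketched as simple arithmetic. This is an illustration only, reading "51% or more" as "more than half"; the function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for the VGDA/quorum rules in Note43.

# Number of VGDAs for a VG with N disks:
# 1 disk -> 2 VGDAs, 2 disks -> 3 VGDAs, otherwise one per disk.
vgda_count() {
    case $1 in
        1) echo 2 ;;
        2) echo 3 ;;
        *) echo "$1" ;;
    esac
}

# Minimum number of VGDAs that must be available (more than half).
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

for disks in 1 2 3 4; do
    total=$(vgda_count "$disks")
    echo "$disks disk(s): $total VGDAs, quorum needs $(quorum_needed "$total")"
done
```

With two disks this shows why losing the disk that holds two of the three VGDAs loses quorum, while losing the other disk does not.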

Note44- For starting subsystems and subservers automatically while machine booting, edit file /etc/rc.tcpip and remove the hash mark from particular stanza.

Note45- Password to go in SMS menu -
Password- admin
This is default until changed.

Note46- Two types of modes available to set securities on files and directories. There are-

1. Symbolic mode
2. Numeric or absolute mode

1. Symbolic mode:
To specify a mode in symbolic form, you must specify three sets of flags.

The first set of flags specifies who is granted or denied the specified permissions,
as follows:
u File owner.
g Group and extended ACL entries pertaining to the file's group.
o All others.
a User, group, and all others. The a flag has the same effect as specifying the ugo flags together. If none of these flags are specified, the default is the a flag and the file creation mask (umask) is applied.

Tip: Do not separate flags with spaces.

The second set of flags specifies whether the permissions are to be removed, applied, or set:
- Removes specified permissions.
+ Applies specified permissions.
= Clears the selected permission field and sets it to the permission specified. If you do not specify a permission following =, the chmod command removes all permissions from the selected field.

The third set of flags specifies the permissions that are to be removed, applied, or set:
r Read permission.
w Write permission.
x Execute permission for files; search permission for directories.
X Execute permission for files if the current (unmodified) mode bits have at least one of the user, group, or other execute bits set. The X flag is ignored if the File parameter is specified and none of the execute bits are set in the current mode bits.

These flags set the special permissions:
s Set-user-ID-on-execution permission if the u flag is specified or implied. Set-group-ID-on-execution permission if the g flag is specified or implied.
t For directories, indicates that only file owners can link or unlink files in the specified directory. For files, sets the save-text attribute.

2. Numeric or absolute mode:
The chmod command also permits you to use octal notation for the mode. The
numeric mode is the sum of one or more of the following values:

4000 Sets user ID on execution.
2000 Sets group ID on execution.
1000 Sets the link permission to directories or sets the save-text attribute for files.
0400 Permits read by owner.
0200 Permits write by owner.
0100 Permits execute or search by owner.
0040 Permits read by group.
0020 Permits write by group.
0010 Permits execute or search by group.
0004 Permits read by others.
0002 Permits write by others.
0001 Permits execute or search by others.
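
To show how the octal values add up, here is a minimal sketch that converts a nine-character permission string (as printed by ls -l, without the leading type character) into its numeric mode. The function name is made up, and it handles only r, w and x, not the set-ID or save-text bits.

```shell
#!/bin/sh
# Hypothetical helper: convert e.g. rwxr-xr-- to 754.
# Assumes exactly nine characters made of r, w, x and -.
perm_to_octal() {
    perms=$1
    result=""
    while [ -n "$perms" ]; do
        # Take the next three characters (owner, then group, then others).
        triplet=$(printf '%s' "$perms" | cut -c1-3)
        perms=$(printf '%s' "$perms" | cut -c4-)
        digit=0
        case $triplet in r??) digit=$(( digit + 4 )) ;; esac
        case $triplet in ?w?) digit=$(( digit + 2 )) ;; esac
        case $triplet in ??x) digit=$(( digit + 1 )) ;; esac
        result="$result$digit"
    done
    printf '%s\n' "$result"
}

perm_to_octal rwxr-xr--     # 754
perm_to_octal rw-r--r--     # 644
```

The result can be passed to chmod in numeric mode, for example chmod 754 on a file.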

Note47- During system boot, the fsck command by default checks four file systems and fixes any errors found. These are-

/
/usr
/var
/tmp.

Note48- By default the following logical volumes are created in rootvg when installing a new system with AIX-

LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 2 2 closed/syncd N/A
hd6 paging 4 8 2 open/syncd N/A
hd8 jfs2log 1 2 2 open/syncd N/A
hd4 jfs2 1 2 2 open/syncd /
hd2 jfs2 9 18 2 open/syncd /usr
hd9var jfs2 1 2 2 open/syncd /var
hd3 jfs2 1 2 2 open/syncd /tmp
hd1 jfs2 1 2 2 open/syncd /home
hd10opt jfs2 1 2 2 open/syncd /opt

Note49- To go in SMS menu while system startup, press “1” and to select factory default bootlist press “5”.

How to add a new machine into NIM

Prerequisites are:
The disk should not be mirrored.
You must know the IP address that you will assign to the Ethernet port of the server.
One machine must be made the master and the others its clients.
The IP address of the master server and the ulimits must be set.
Steps to be followed at the master end are as follows:
Step 1) Insert CD 1 of the OS base CDs and run the nim_master_setup command.
It automatically configures the required setup and takes nearly 5-10 minutes to complete.
Step 2) Type vi /etc/hosts and add entries for your client machines,
i.e. you need to give the IP address and the host name of each.
Step 3) Once this is done, run the smitty nim command.
A: Select Perform NIM Administration Tasks.
(1) Manage Machines.
* Define a Machine: give the host name of your client machine (then press esc+3).
(2) Manage Network Install Resource Allocation.
* Allocate Network Install Resources: this shows the host names of the master and the clients; select the client. Once you have selected the client machine it shows the list of things that will be installed; select all of them.
(3) Perform Operations on Machines.
* Select
This completes all the settings that need to be done on the master server.
Steps to be followed at the client end are as follows:
Step 1) Boot the client server into the SMS menu and select option 2, i.e. Remote IPL.
Step 2) Give the IP address, subnet mask and host name of the client machine.
Step 3) Make sure that the protocol used is normal instead of IEEE802.1.
Step 4) Spanning tree should not be selected, i.e. it should be off.
Step 5) It gives you the option to test with ping; select that and check that the output is "ping successful".
Step 6) Press X to exit the SMS menu. The installation will start and takes nearly 15-20 minutes to complete.
This completes all the settings that need to be done on the client server.
And this completes the NIM installation.

How to remove a tape drive from an LPAR

To remove the tape drive:

Step 1)
Go to the LPAR to which it is assigned
a) rmdev -dl rmt0
b) lsdev -Cl cd0 -F parent
ide0
c) lsslot -c slot -l ide0
slot no T12 pci2 ide0
d) rmdev -l pci2 -R (-R also removes the child devices)
cd0 defined
ide0 defined
pci2 defined
or
e) rmdev -l pci2
f) rmdev -l ide0

Then right-click the particular LPAR -> Dynamic Logical Partitioning -> Physical Adapter Resources -> Remove. Select slot T12 and click OK.

A script which is used to recover rootvg when rootvg has failed

# cat rvgrecover
VG=rootvg
PV=hdisk0
# Back up the ODM classes before changing them
cp /etc/objrepos/CuAt /etc/objrepos/CuAt.orig
cp /etc/objrepos/CuDep /etc/objrepos/CuDep.orig
cp /etc/objrepos/CuDv /etc/objrepos/CuDv.orig
cp /etc/objrepos/CuDvDr /etc/objrepos/CuDvDr.orig
# Remove the ODM entries of every logical volume found on the disk
lqueryvg -Lp $PV | awk '{print $2}' | while read LVname;
do
odmdelete -q "name=$LVname" -o CuAt
odmdelete -q "name=$LVname" -o CuDv
odmdelete -q "value3=$LVname" -o CuDvDr
done
# Remove the ODM entries of the volume group itself
odmdelete -q "name=$VG" -o CuAt
odmdelete -q "parent=$VG" -o CuDv
odmdelete -q "name=$VG" -o CuDep
odmdelete -q "dependency=$VG" -o CuDep
odmdelete -q "value1=10" -o CuDvDr
odmdelete -q "value3=$VG" -o CuDvDr
# Rebuild the entries from the disk
importvg -y $VG $PV #Ignore lvaryoffvg errors
varyonvg $VG

Nfs configuration and Auto mount

Server Side
We want to mount the /backup NFS directory from 10.0.128.114 on the 10.0.252.88 server.
# mknfsexp -d /backup -t ro -h 10.0.252.88
Client Side
# mknfsmnt -f /backup1 -d /backup -h 10.0.128.114
The above command mounts /backup on /backup1 on the 10.0.252.88 server.

############### Using AutoFS to automatically mount a file system #############

AutoFS relies on the use of the automount command to propagate the automatic mount configuration information to the AutoFS kernel extension and start the automountd daemon. Through this configuration propagation, the extension automatically and transparently mounts file systems whenever a file or a directory within that file system is opened. The extension informs the automountd daemon of mount and unmount requests, and the automountd daemon actually performs the requested service.
Because the name-to-location binding is dynamic within the automountd daemon, updates to a Network Information Service (NIS) map used by the automountd daemon are transparent to the user. Also, there is no need to premount shared file systems for applications that have hard-coded references to files and directories, nor is there a need to maintain records of which hosts must be mounted for particular applications.

AutoFS allows file systems to be mounted as needed. With this method of mounting directories, all file systems do not need to be mounted all of the time; only those being used are mounted.

For example, to mount an NFS directory automatically:
Verify that the NFS server has exported the directory by entering:

# showmount -e ServerName

where ServerName is the name of the NFS server. This command displays the names of the Directories currently exported from the NFS server.
Create an AutoFS master file and map file. AutoFS mounts and unmounts the directories specified in these map files.

For example, suppose you want AutoFS to mount the /local/dir1 and /local/dir2 directories as needed from the serve1 server onto the /remote/dir1 and /remote/dir2 directories, respectively. The auto_master file entry would be as follows:

/remote /tmp/mount.map
The /tmp/mount.map file entry would be as follows:

dir1 -rw serve1:/local/dir1
dir2 -rw serve1:/local/dir2

Ensure that the AutoFS kernel extension is loaded and the automountd daemon is running.
This can be accomplished in two ways:
Using the automount command: Issue

/usr/bin/automount -v

Using SRC: Issue lssrc -s automountd. If the automountd subsystem is not running, issue

startsrc -s automountd

Note: Starting the automountd daemon with the startsrc command will ignore any changes that have been made to the auto_master file.
To stop the automount daemon, issue the stopsrc -s automountd command. If, for some reason, the automountd daemon was started without the use of SRC, issue:

kill automountd_PID

where automountd_PID is the process ID of the automountd daemon. (Running the ps -e command displays the process ID of the automountd daemon.) The kill command sends a SIGTERM signal to the automountd daemon.

How to set up a quota in AIX

######### Procedure to set up the disk quota ############################
To set up the disk quota system, use the following procedure:
1. Log in with root authority.
2. Determine which file systems require quotas.
3. Use the chfs command to include the userquota and groupquota quota configuration attributes in the /etc/filesystems file.
The following example uses the chfs command to enable user quotas on the /home file system:
# chfs -a "quota = userquota" /home
To enable both user and group quotas on the /home file system, type:
# chfs -a "quota = userquota,groupquota" /home
The corresponding entry in the /etc/filesystems file is displayed as follows:
/home:
        dev = /dev/hd1
        vfs = jfs
        log = /dev/hd8
        mount = true
        check = true
        quota = userquota,groupquota
        options = rw
4. The following example uses the chfs command to establish user and group quotas for the /home file system and names the myquota.user and myquota.group quota files:
# chfs -a "userquota = /home/myquota.user" -a "groupquota = /home/myquota.group" /home
The following example entry shows quota limits for the gpsilva user:
Quotas for user gpsilva:
/home: blocks in use: 30, limits (soft = 100, hard = 150)
       inodes in use: 73, limits (soft = 200, hard = 250)
This user has used 30 KB of the maximum 100 KB of disk space. Of the maximum 200 files, gpsilva has created 73. This user has buffers of 50 KB of disk space and 50 files that can be allocated to temporary storage.
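
The numbers in that explanation follow from simple subtraction, sketched here. The function name and output format are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: quota_headroom IN_USE SOFT HARD
# Prints how much is left before the soft limit and the size of the
# temporary buffer between the soft and hard limits.
quota_headroom() {
    echo "to_soft=$(( $2 - $1 )) buffer=$(( $3 - $2 ))"
}

quota_headroom 30 100 150    # blocks (KB): to_soft=70 buffer=50
quota_headroom 73 200 250    # inodes:      to_soft=127 buffer=50
```

So gpsilva can use another 70 KB and 127 files before reaching the soft limits, and a further 50 KB and 50 files between the soft and hard limits.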
5. To duplicate the quotas established for user gpsilva for user tneiva, type:
# edquota -p gpsilva tneiva
6. Enable the quota system with the quotaon command. The quotaon command enables quotas for a specified file system or for all file systems with quotas (as indicated in the /etc/filesystems file) when used with the -a flag.
7. Use the quotacheck command to check the consistency of the quota files against actual disk usage.
# Very important
To enable this check and to turn on quotas during system startup, add the following lines at the end of the /etc/rc file:
echo " Enabling filesystem quotas "
/usr/sbin/quotacheck -a
/usr/sbin/quotaon -a

########## Some examples #################
There are related commands, namely the edquota command, the quotacheck command, and the repquota command.
The following examples show the commands in typical uses:
1. To enable user quotas for the /usr/Tivoli/tsm/server/db file system,
# quotaon -u /usr/Tivoli/tsm/server/db
2. To disable user and group quotas for all file systems in the /etc/filesystems and print a message, enter:
# quotaoff -v -a
3. To display your quotas as user neivac, type:
$ quota
The system displays the following information:
User quotas for user neivac (uid 502):
Filesystem blocks quota limit grace Files quota limit grace
/u 20 55 60 20 60 65
4. To display quotas as the root user for user gpsilva, type:
quota -u gpsilva
The system displays the following information:
User quotas for user gpsilva (uid 2702):
Filesystem blocks quota limit grace files quota limit grace
/u 48 50 60 7 60 60

# To disable quotas use the commands
# quotaoff -a ===> disables quotas for all file systems.
# quotaoff -u FileSystem ===> disables user quotas on the given file system.
# quotaoff -g FileSystem ===> disables group quotas on the given file system.

File Systems IN AIX

# File Systems Types:
JFS, JFS2 (Enhanced JFS), NFS, CD-ROM file systems
# File Systems Structure:
1. Superblock ==> contains control information about the file system, such as a) size of the file system b) name of the file system c) the file system log device d) the version number e) the number of inodes f) list of free inodes and data blocks g) date and time of creation of the file system, and the file system state.
IMP: Corruption of the superblock may render the file system unusable. This is why the system keeps a second copy of the superblock on logical block 31.

2. Allocation Group ==> consists of inodes and their corresponding data blocks. An allocation group spans multiple adjacent disk blocks, which improves the speed of I/O operations. Both JFS and JFS2 file systems use allocation groups.

3. Inodes ===> contain control information about a file, such as a) type, size, owner, and the date and time when the file was created, modified and last accessed. b) They also contain the pointers to the data blocks that store the actual data of the file. Every file has a corresponding inode. c) JFS restricts every file system to 16 M (2^24) inodes.
4. Data Blocks ==> store the actual data of the file or pointers to other data blocks. The default disk block size is 4 KB.
5. Fragments ===> Fragments of logical blocks can be used to support files smaller than the standard size of the logical block (4 KB). This rule applies only to the last block of a file smaller than 32 KB.

############### File Systems Differences ########################################

Function                                 JFS                                  JFS2
Architectural maximum file system size   1 TB                                 4 PB
Architectural maximum file size          64 GB                                4 PB
Number of inodes                         Fixed, set at file system creation   Dynamic
Inode size                               128 bytes                            512 bytes
Fragment size                            512                                  512
Block size                               4096                                 4096
Directory organization                   Linear                               B-tree
Compression                              Yes                                  No
Default ownership at creation            sys.sys                              root.system
SGID of default file mode                SGID=on                              SGID=off
Quotas                                   Yes                                  Yes

####################### Example of file system creation #####################################################
Creating file systems without specifying logical volumes
# lsvg -l testvg
testvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
# crfs -v jfs -g testvg -a size=10M -m /fs1
Based on the parameters chosen, the new /fs1 JFS file system
is limited to a maximum size of 134217728 (512 byte blocks)
New File System size is 262144
# crfs -v jfs2 -g testvg -a size=10M -p ro -m /fs2
File system created successfully.
130864 kilobytes total disk space.
New File System size is 262144

# lsvg -l testvg
testvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
loglv00             jfslog     1     1     1    closed/syncd  N/A
lv00                jfs        1     1     1    closed/syncd  /fs1
loglv01             jfs2log    1     1     1    closed/syncd  N/A
fslv00              jfs2       1     1     1    closed/syncd  /fs2
# lslv lv00
LOGICAL VOLUME:     lv00                   VOLUME GROUP:   testvg
LV IDENTIFIER:      00c478de00004c0000000107d96de510.2 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       closed/syncd
TYPE:               jfs                    WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        128 megabyte(s)
COPIES:             1                      SCHED POLICY:   parallel
LPs:                1                      PPs:            1
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       middle                 UPPER BOUND:    32
MOUNT POINT:        /fs1                   LABEL:          /fs1
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
# lslv fslv00
LOGICAL VOLUME:     fslv00                 VOLUME GROUP:   testvg
LV IDENTIFIER:      00c478de00004c0000000107d96de510.4 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       closed/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        128 megabyte(s)
COPIES:             1                      SCHED POLICY:   parallel
LPs:                1                      PPs:            1
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       middle                 UPPER BOUND:    32
MOUNT POINT:        /fs2                   LABEL:          /fs2
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
# cat /etc/filesystems | grep -ip fs1
/fs1:
        dev = /dev/lv00
        vfs = jfs
        log = /dev/loglv00
        mount = false
        account = false

# mount -a ===> mounts all the file systems

# mount ===> displays the mounted file systems
# mount
  node       mounted        mounted over    vfs       date        options
-------- ---------------  ---------------  ------ ------------ ---------------
         /dev/hd4         /                jfs2   Nov 27 12:36 rw,log=/dev/hd8
         /dev/hd2         /usr             jfs2   Nov 27 12:36 rw,log=/dev/hd8
         /dev/hd9var      /var             jfs2   Nov 27 12:36 rw,log=/dev/hd8
         /dev/hd3         /tmp             jfs2   Nov 27 12:36 rw,log=/dev/hd8
         /dev/hd1         /home            jfs2   Nov 27 12:36 rw,log=/dev/hd8
         /proc            /proc            procfs Nov 27 12:36 rw
         /dev/hd10opt     /opt             jfs2   Nov 27 12:36 rw,log=/dev/hd8
         /dev/testlv      /test            jfs2   Nov 28 19:54

# lsfs ===> shows the characteristics of file systems
# rmfs ===> removes a file system
# lsvg -l testvg
testvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
loglv00             jfslog     1     1     1    closed/syncd  N/A
lv00                jfs        1     1     1    closed/syncd  /fs1
loglv01             jfs2log    1     1     1    open/syncd    N/A
fslv00              jfs2       1     1     1    closed/syncd  /fs2
testlv              jfs2       1     1     1    open/syncd    /test

########################### Removing the file Systems ###############################################################
# rmfs /test
rmfs: 0506-921 /test is currently mounted.
# umount /test
# rmfs /test
rmlv: Logical volume testlv is removed.
# lsvg -l testvg
testvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
loglv00             jfslog     1     1     1    closed/syncd  N/A
lv00                jfs        1     1     1    closed/syncd  /fs1
loglv01             jfs2log    1     1     1    closed/syncd  N/A
fslv00              jfs2       1     1     1    closed/syncd  /fs2
# cat /etc/filesystems | grep test
#

###### Changing the attributes of file systems ####################################
# chfs -a size=250M -p rw /fs2
Filesystem size changed to 524288
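
The reported size is larger than the 250 MB requested, which is consistent with the file system growing in whole physical partitions. A minimal shell sketch of that arithmetic, assuming the 128 MB PP size shown in the lslv output for testvg (the function name is made up for illustration):

```shell
# Hypothetical helper: round a requested size (MB) up to a whole number
# of physical partitions and print the result in 512-byte blocks.
fs_blocks_after_resize() {
    req_mb=$1      # requested size in MB, e.g. 250
    pp_mb=$2       # physical partition size in MB, e.g. 128
    pps=$(( (req_mb + pp_mb - 1) / pp_mb ))   # round up to whole PPs
    echo $(( pps * pp_mb * 2048 ))            # 1 MB = 2048 blocks of 512 bytes
}

fs_blocks_after_resize 250 128   # prints 524288
```

250 MB needs two 128 MB partitions, i.e. 256 MB, which is 524288 blocks of 512 bytes.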

# fsck ===> Checks file system consistency and interactively repairs the file system. Always run the fsck command on unmounted file systems; running it on a mounted file system can report false errors or cause damage.

################ if 1st superblock corrupted then how to recover #############################################
If you receive one of the following errors from the fsck or mount commands, the problem may be a corrupted superblock:
fsck: Not an AIX3 file system
fsck: Not an AIXV3 file system
fsck: Not an AIX4 file system
fsck: Not an AIXV4 file system
fsck: Not a recognized file system type
mount: invalid argument
The problem can be resolved by restoring the backup of the superblock over the primary superblock using the following command:
# dd count=1 bs=4k skip=31 seek=1 if=/dev/lv00 of=/dev/lv00
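
The dd command reads one 4 KB block at block 31 of the logical volume (which the command treats as the backup superblock) and writes it over block 1. To see what skip and seek do without touching a real device, the same copy can be rehearsed on a scratch file. This is only a sketch: the function name and file are made up, and conv=notrunc is added because the target here is a regular file rather than a logical volume.

```shell
# Rehearse the block copy on a scratch file (never on a real LV).
rehearse_superblock_copy() {
    img=$1
    # 32 blocks of 4 KB, all zeros
    dd if=/dev/zero of="$img" bs=4k count=32 2>/dev/null
    # put a recognizable marker at the start of block 31
    printf 'BACKUP-SB' | dd of="$img" bs=4k seek=31 conv=notrunc 2>/dev/null
    # same skip/seek as the recovery command: block 31 -> block 1
    dd count=1 bs=4k skip=31 seek=1 if="$img" of="$img" conv=notrunc 2>/dev/null
    # show the first 9 bytes of block 1
    dd if="$img" bs=4k skip=1 count=1 2>/dev/null | head -c 9
}

img=$(mktemp)
rehearse_superblock_copy "$img"; echo
rm -f "$img"
```

After the copy, block 1 starts with the marker that was written to block 31.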

############ Not able to umount the file systems ###############################
# Files are open within a file system. Close these files before the file system can be unmounted. The fuser command is often the best way to determine the process IDs for all processes that have open references within a specified file system. The process having an open reference can be killed by using the kill command and the unmount can be accomplished.
# If the file system is still busy and not getting unmounted, this could be due to a kernel extension that is loaded, but exists within the source file system. The fuser command will not show these kinds of references, because a user process is not involved. However, the genkex command will report on all loaded kernel extensions.

# find /home -type d -exec fuser -u {} \;
/home:
/home/lost+found:
/home/guest:
/home/kenzie:     3548c(kenzie)

ssh without password

This procedure sets up ssh login without a password prompt from one server (A, as user a) to one other server (B, as user b).
First generate a key pair as user a on A:
a@A:~> ssh-keygen -t rsa
Now use ssh to create a directory ~/.ssh as user b on B. (The directory may already exist, which is fine):
a@A:~> ssh b@B mkdir -p .ssh
b@B's password:
Finally append a's new public key to b@B:.ssh/authorized_keys and enter b's password one last time:
a@A:~> cat .ssh/id_rsa.pub | ssh b@B 'cat >> .ssh/authorized_keys'
b@B's password:
From now on you can log into B as b from A as a without password:
a@A:~> ssh b@B hostname
B
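
Running the append step twice leaves duplicate entries in authorized_keys. A small sketch of a helper that adds a public key only when it is not already present; the function name is hypothetical, and on a real system it would be run on B against b's own files. The demo below uses a scratch directory and a placeholder key so nothing real is modified.

```shell
# Hypothetical helper: add a public key to an authorized_keys file once.
add_key_once() {
    pubkey_file=$1          # e.g. id_rsa.pub copied over from A
    auth_file=$2            # e.g. $HOME/.ssh/authorized_keys
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -F fixed strings, -x whole-line match, -f read the pattern from the key file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"  # keep the file private to its owner
}

# Demo in a scratch directory with a placeholder key
demo=$(mktemp -d)
echo 'ssh-rsa AAAAexamplekey a@A' > "$demo/id_rsa.pub"
add_key_once "$demo/id_rsa.pub" "$demo/.ssh/authorized_keys"
add_key_once "$demo/id_rsa.pub" "$demo/.ssh/authorized_keys"
wc -l < "$demo/.ssh/authorized_keys"   # one line, not two
rm -rf "$demo"
```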
=====================================================================================

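The same setup between two servers, in both directions (here for the user oracle):
Log in to ServerA.
Go to the home directory of the user.
Run the commands (ssh-keygen only if no key pair exists yet)
ssh-keygen -t rsa
scp /home/oracle/.ssh/id_rsa.pub ServerB:/home/oracle/.ssh/authorized_keys
You will then be able to log in to ServerB without being asked for a password. Note that scp overwrites any existing authorized_keys file on ServerB.
Log in to ServerB.
Run the commands
ssh-keygen -t rsa
scp /home/oracle/.ssh/id_rsa.pub ServerA:/home/oracle/.ssh/authorized_keys
You will then be able to log in to ServerA without being asked for a password.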

Changing the login screen welcome message

To prevent displaying certain information on login screens, edit the herald parameter in the /etc/security/login.cfg file. The default herald contains the welcome message that displays with your login prompt. To change this parameter, you can either use the chsec command or edit the file directly.
The following example uses the chsec command to change the default herald parameter:
# chsec -f /etc/security/login.cfg -s default -a herald="Unauthorized use of this system is prohibited.\n\nlogin: "

To edit the file directly, open the /etc/security/login.cfg file and update the herald parameter as follows:
default:
        herald = "Unauthorized use of this system is prohibited\n\nlogin:"
        sak_enable = false
        logintimes =
        logindisable = 0
        logininterval = 0
        loginreenable = 0
        logindelay = 0

Securing unattended Terminals

Always lock your terminal when you leave it unattended, to prevent unauthorized access. Leaving system terminals unsecured poses a potential security hazard. To lock your terminal, use the lock command.

Changing the CDE login screen

This security issue also affects Common Desktop Environment (CDE) users. By default, the CDE login screen also displays the host name and the operating system version. To prevent this information from being displayed, edit the /usr/dt/config/$LANG/Xresources file, where $LANG refers to the local language installed on your machine.

In this example, assuming that $LANG is set to C, copy this file to /etc/dt/config/C/Xresources. Next, open the /usr/dt/config/C/Xresources file and edit it to remove welcome messages that include the host name and operating system version.

REMOVING UNWANTED FILES IN AIX

Removing Obsolete Files

Occasionally, you need to remove unwanted and unneeded files from your system. AIX provides you with the skulker command, which allows you to automatically track and remove obsolete files. This facility works on candidate files located in the /tmp directory, executable a.out files, core files, and ed.hup files.

To run the skulker command, type
# skulker -p

You can automate the skulker command by setting up the cron facility to perform this task regularly.

Regaining root's password

1) Boot from a CD-ROM or a bootable tape.

2) Press F5 or 5.

3) Select option 3 from the Installation and Maintenance menu: Start Maintenance Mode for System Recovery.

4) Follow the options to activate rootvg and obtain a shell.

5) Once a shell is available, run the passwd command to reset the password for root.

6) Run sync to write the change to disk.

7) Reboot the system.

Saturday, July 19, 2008

Dynamically move the resource from one lpar to another

To remove Tape Drive :

Step 1 )
Go to the LPAR to which the device is assigned
rmdev -dl rmt0
lsdev -Cl cd0 -F parent
ide0
lsslot -c slot -l ide0
slot no T12 pci2 ide0
rmdev -l pci2 -R ( -R also unconfigures the child devices )
cd0 Defined
ide0 Defined
pci2 Defined
or
rmdev -l pci2
rmdev -l ide0

On the HMC, right-click the particular LPAR -> Dynamic Logical Partitioning -> Physical Adapter Resources -> Remove
Select the slot T12 and click OK

NIM Installation Steps

Prerequisites are

The disk should not be mirrored.

You must know the IP address of the Ethernet port that you will assign to the server.

One machine must be set up as the Master and the others as its Clients.

The IP address of the master server and the ulimits must be set.

Steps to be followed on the Master are as follows :

Step 1) Insert CD 1 of the base OS media and run the nim_master_setup command.

It automatically configures the required setup; this takes roughly 5-10 minutes to complete.

Step 2) Type vi /etc/hosts and add entries for all of your client machines,

i.e. the IP address and the host name of each client.

Step 3) Once this is done, run the smitty nim command

A: Select Perform NIM Administration Tasks.

(1) Manage Machines.

* Define a Machine ..........here you need to give the host name of your client machine (then press Esc+3).

(2) Manage Network Install Resource Allocation.

* Allocate Network Install Resources ..........this shows the host names of the master and the clients; select the client.
.............once you have selected the client machine it shows the list of resources that can be installed; select all of them.

(3) Perform Operations on Machines.
* Select

This completes all the settings that need to be made on the Master server.

Steps to be followed on the Client are as follows :

Step 1) Boot the client into the SMS menu and select option 2, i.e. Remote IPL.

Step 2) Give the IP address, subnet mask and host name of the client machine.

Step 3) Make sure that the protocol used is normal instead of IEEE802.1.

Step 4) Spanning tree should not be selected, i.e. it should be off.

Step 5) You get an option to run a ping test; select it and check that the output reports a successful ping.

Step 6) Press X to exit the SMS menu. The installation starts and takes roughly 15-20 minutes to complete.

This completes all the settings that need to be made on the Client.

And this completes the NIM installation.

AIX Boot Process.

First the hardware is checked by the POST (Power-On Self Test).
Then ROS IPL checks the user boot list for available devices and boots from the first valid boot device it finds. ROS (Read Only Storage) holds the code that locates and loads the bootstrap code, which contains the boot information.
Then the software ROS locates and loads the boot information and hands over control to the BLV (Boot Logical Volume).
Then control is passed to the kernel, which begins system initialization.
The kernel uses the RAM file system found in the BLV and starts the init process from /usr/lib/boot.

Init then executes the rc.boot script with argument 1

Init starts with rc.boot 1.
The restbase command is called to copy the partial ODM from the BLV into the RAM file system.
Then cfgmgr -f is called to run the configuration rules from the ODM.
The bootinfo -b command is called to determine the last boot device.
So the phase 1 configuration methods result in the configuration of the base devices into the system.
The rc.boot 2 script is called next.
rootvg is activated with the varyonvg command.
The /dev/hd4 file system is checked and mounted on /, and /dev/hd2 is checked and mounted on /usr.
The /dev/hd9var file system is checked and mounted on /var. At the same time the copycore command checks whether a dump occurred. If a dump exists, it is copied from the dump device /dev/hd6 to the default directory /var/adm/ras. Then /var is unmounted.
Then the paging space is turned on with the swapon command.
The mergedev command is called and copies all /dev files from the RAMFS to disk.
All customized ODM files from the RAMFS are copied to disk as well.
Then /var is mounted again.
At the end of rc.boot 2 the kernel switches from the RAMFS to the root file system on disk.

Then the init process runs the rc.boot 3 script for the remaining device configuration.
The /dev/hd3 file system is checked and mounted on /tmp.
rootvg is synchronized with the syncvg command.
The console is configured with the cfgcon command.
Then the savebase command saves the boot customization details to be used on the next boot.
The errdemon and syncd daemons are started; syncd flushes data to disk every 60 seconds.
The /etc/nologin file is removed.
Then the /etc/inittab file is processed to start the other processes.

Technology level Upgrade Steps for AIX

Step 1) Download the appropriate TL from the below link.
http://www-912.ibm.com/eserver/support/fixes/

Step 2) Run oslevel -r to check the current level of AIX.

Output will be like below.
bash-3.00# oslevel -r
5200-10

Step 3) Run lppchk -v to verify that all filesets have their required requisites and are completely installed.

Run smit commit to commit all the software installed on the machine.

Step 4) Run smitty update_all to install the ML.

Step 5) Run oslevel -rl 5200-15 to determine which filesets are missing.

Output will be like below.

bash-3.00# oslevel -rl 5100-09
Fileset Actual Level Recommended ML
-----------------------------------------------------------------------------
devices.scsi.disk.diag.com 5.1.0.50 5.1.0.55

The output from this command told me I was missing the following fileset

devices.scsi.disk.diag.com

Step 6) Determine the APAR for your missing filesets

Which APAR does this fileset belong to? The APAR number was determined by using

instfix -icq | grep devices.scsi.disk.diag.com

Output will be below.

-bash-3.00# instfix -icq | grep devices.scsi.disk.diag.com
5.1.0.0_AIX_ML:devices.scsi.disk.diag.com:5.1.0.0:5.1.0.50:+:AIX 5.1.0.0 Release
5100-01_AIX_ML:devices.scsi.disk.diag.com:5.1.0.10:5.1.0.50:+:AIX 5100-01 Update
IY22854:devices.scsi.disk.diag.com:5.1.0.15:5.1.0.50:+:Required updates for eServer pSeries p690
5100-02_AIX_ML:devices.scsi.disk.diag.com:5.1.0.25:5.1.0.50:+:AIX 5100-02 Update
5100-03_AIX_ML:devices.scsi.disk.diag.com:5.1.0.35:5.1.0.50:+:AIX 5100-03 Update
5100-04_AIX_ML:devices.scsi.disk.diag.com:5.1.0.50:5.1.0.50:=:AIX 5100-04 Update
IY37867:devices.scsi.disk.diag.com:5.1.0.50:5.1.0.50:=:Add Diagnostic Support for a Slimline IDE DVDROM
IY33586:devices.scsi.disk.diag.com:5.1.0.35:5.1.0.50:+:The DVDROM download microcode does not display usage error.
IY33489:devices.scsi.disk.diag.com:5.1.0.35:5.1.0.50:+:Microcode Download Support for DVDROM Drives
IY32871:devices.scsi.disk.diag.com:5.1.0.35:5.1.0.50:+:Need to export extract_vpd_kw
IY31162:devices.scsi.disk.diag.com:5.1.0.35:5.1.0.50:+:lscfg -vp causes coredump
IY29658:devices.scsi.disk.diag.com:5.1.0.35:5.1.0.50:+:lscfg -vp core dump
IY27849:devices.scsi.disk.diag.com:5.1.0.25:5.1.0.50:+:Serial disk format & certify fail
IY23123:devices.scsi.disk.diag.com:5.1.0.15:5.1.0.50:+:Periodic Diagnostics: Default should be to only test proces
IY21615:devices.scsi.disk.diag.com:5.1.0.15:5.1.0.50:+:No service action displayed for certify failures
IY21326:devices.scsi.disk.diag.com:5.1.0.15:5.1.0.50:+:Gramatical error in message file
IY18557:devices.scsi.disk.diag.com:5.1.0.10:5.1.0.50:+:Drive unconfigured errors and cosmetic fixes
5100-05_AIX_ML:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:AIX 5100-05 Update
5100-06_AIX_ML:devices.scsi.disk.diag.com:5.1.0.54:5.1.0.50:-:AIX 5100-06 Update
5100-07_AIX_ML:devices.scsi.disk.diag.com:5.1.0.55:5.1.0.50:-:AIX 5100-07 Update
5100-08_AIX_ML:devices.scsi.disk.diag.com:5.1.0.55:5.1.0.50:-:AIX 5100-08 Update
5100-09_AIX_ML:devices.scsi.disk.diag.com:5.1.0.55:5.1.0.50:-:AIX 5100-09 Update
IY47320:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:New function
IY48980:devices.scsi.disk.diag.com:5.1.0.52:5.1.0.50:-:Cannot certify a disk of type sispdisk using diagnostics
IY54381:devices.scsi.disk.diag.com:5.1.0.54:5.1.0.50:-:new function
IY46285:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:Support for new diagnostic function.
IY47416:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:new function
-bash-3.00#

The output is broken down into six colon separated tokens. They breakdown as follows:

keyword name:fileset name:required level:installed level:status:abstract
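
Because the fields are colon separated, the output is easy to filter with awk. A sketch that lists every keyword whose status field is "-", meaning the installed level is below the required level; the function name is made up, and the sample lines are taken from the output above:

```shell
# Print "keyword fileset required installed" for entries whose status is "-"
# (installed level is lower than the level the keyword requires).
list_downlevel() {
    awk -F: '$5 == "-" { print $1, $2, $3, $4 }'
}

# On a live system this would be: instfix -icq | list_downlevel
list_downlevel <<'EOF'
5100-04_AIX_ML:devices.scsi.disk.diag.com:5.1.0.50:5.1.0.50:=:AIX 5100-04 Update
5100-05_AIX_ML:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:AIX 5100-05 Update
IY47320:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:New function
EOF
```

Only the first five fields are used, so a colon inside the abstract text does not affect the result.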

Step 7) Use instfix -i | grep ML to check for which MLs filesets are missing, as below.

-bash-3.00# instfix -i |grep ML
All filesets for 5.1.0.0_AIX_ML were found.
All filesets for 5100-01_AIX_ML were found.
All filesets for 5100-02_AIX_ML were found.
All filesets for 5100-03_AIX_ML were found.
All filesets for 5100-04_AIX_ML were found.
Not all filesets for 5100-05_AIX_ML were found.
Not all filesets for 5100-06_AIX_ML were found.
Not all filesets for 5100-07_AIX_ML were found.
Not all filesets for 5100-08_AIX_ML were found.
Not all filesets for 5100-09_AIX_ML were found.

Step 8) Use instfix to determine the missing filesets, as shown in the following example

-bash-3.00# instfix -ciqk 5100-05_AIX_ML | grep :-:
5100-05_AIX_ML:devices.scsi.disk.diag.com:5.1.0.51:5.1.0.50:-:AIX 5100-05 Update
-bash-3.00#
-bash-3.00# instfix -ciqk 5100-06_AIX_ML | grep :-:
5100-06_AIX_ML:devices.scsi.disk.diag.com:5.1.0.54:5.1.0.50:-:AIX 5100-06 Update
-bash-3.00#
-bash-3.00# instfix -ciqk 5100-07_AIX_ML | grep :-:
5100-07_AIX_ML:devices.scsi.disk.diag.com:5.1.0.55:5.1.0.50:-:AIX 5100-07 Update
-bash-3.00# instfix -ciqk 5100-08_AIX_ML | grep :-:
5100-08_AIX_ML:devices.scsi.disk.diag.com:5.1.0.55:5.1.0.50:-:AIX 5100-08 Update
-bash-3.00# instfix -ciqk 5100-09_AIX_ML | grep :-:
5100-09_AIX_ML:devices.scsi.disk.diag.com:5.1.0.55:5.1.0.50:-:AIX 5100-09 Update

The six fields above (: delimited) indicate:
The OS/ML identifier, the fileset name, the REQUIRED fileset level to attain the ML, the INSTALLED fileset level,
the fileset status and the source abstract

Step 9) bosboot -ad hdiskn (where hdiskn is the hard disk on which rootvg resides)

Step 10) Reboot the machine.

LVM Interview Question

To change the file system size of the /test Journaled File System, enter:

chfs -a size=24576 /test
This command changes the size of the /test Journaled File System to 24576 512-byte blocks, or 12MB (provided it was previously no larger than this).

To increase the size of the /test Journaled File System, enter:

chfs -a size=+8192 /test
This command increases the size of the /test Journaled File System by 8192 512-byte blocks, or 4 MB.
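
Sizes given without a unit suffix are counted in 512-byte blocks, so it helps to convert in both directions. A tiny sketch of the arithmetic used in the two examples above (the function names are illustrative only):

```shell
# Convert between 512-byte blocks and megabytes (integer arithmetic).
blocks_to_mb() { echo $(( $1 * 512 / 1024 / 1024 )); }
mb_to_blocks() { echo $(( $1 * 2048 )); }

blocks_to_mb 24576   # prints 12
blocks_to_mb 8192    # prints 4
mb_to_blocks 64      # prints 131072
```

The division truncates, so block counts that are not a whole number of megabytes are rounded down.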

To convert a JFS2 file system to a version which can support NFS4 ACLs, type:

chfs -a ea=v2 /test
To change the mount point of a file system, enter:

chfs -m /test2 /test
This command changes the mount point of a file system from /test to /test2.

To delete the accounting attribute from a file system, enter:

chfs -d account /home
This command removes the accounting attribute from the /home file system. The accounting attribute is deleted from the /home: stanza of the /etc/filesystems file.

To split off a copy of a mirrored file system and mount it read-only for use as an online backup, enter:

chfs -a splitcopy=/backup -a copy=2 /testfs
This mounts a read-only copy of /testfs at /backup.

To change the file system size of the /test Journaled File System, enter:

chfs -a size=64M /test
This command changes the size of the /test Journaled File System to 64MB (provided it was previously no larger than this).

To reduce the size of the /test JFS2 file system, enter:

chfs -a size=-16M /test
This command reduces the size of the /test JFS2 file system by 16MB.

To freeze a file system, enter:

chfs -a freeze=60 /adl
This command freezes the /adl file system for a maximum of 60 seconds.

To thaw a file system, enter:

chfs -a freeze=off /zml

Monday, June 30, 2008

Creating LPAR from command line from HMC

Create new LPAR using command line

mksyscfg -r lpar -m MACHINE -i "name=LPARNAME,profile_name=normal,lpar_env=aixlinux,shared_proc_pool_util_auth=1,min_mem=512,desired_mem=2048,max_mem=4096,proc_mode=shared,min_proc_units=0.2,desired_proc_units=0.5,max_proc_units=2.0,min_procs=1,desired_procs=2,max_procs=2,sharing_mode=uncap,uncap_weight=128,boot_mode=norm,conn_monitoring=1"

Note :- Use man mksyscfg command for all flag information.

Another method is to create LPARs from a configuration file, which is useful when more than one LPAR has to be created at the same time.

Here is an example for 2 LPARs, each definition starting at new line:

name=LPAR1,profile_name=normal,lpar_env=aixlinux,all_resources=0,min_mem=1024,desired_mem=9216,max_mem=9216,proc_mode=shared,min_proc_units=0.3,desired_proc_units=1.0,max_proc_units=3.0,min_procs=1,desired_procs=3,max_procs=3,sharing_mode=uncap,uncap_weight=128,lpar_io_pool_ids=none,max_virtual_slots=10,"virtual_scsi_adapters=6/client/4/vio1a/11/1,7/client/9/vio2a/11/1","virtual_eth_adapters=4/0/3//0/1,5/0/4//0/1",boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,shared_proc_pool_util_auth=1
name=LPAR2,profile_name=normal,lpar_env=aixlinux,all_resources=0,min_mem=1024,desired_mem=9216,max_mem=9216,proc_mode=shared,min_proc_units=0.3,desired_proc_units=1.0,max_proc_units=3.0,min_procs=1,desired_procs=3,max_procs=3,sharing_mode=uncap,uncap_weight=128,lpar_io_pool_ids=none,max_virtual_slots=10,"virtual_scsi_adapters=6/client/4/vio1a/12/1,7/client/9/vio2a/12/1","virtual_eth_adapters=4/0/3//0/1,5/0/4//0/1",boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,work_group_id=none,shared_proc_pool_util_auth=1

Copy this file to HMC and run:

mksyscfg -r lpar -m SERVERNAME -f /tmp/profiles.txt

where profiles.txt contains all the LPAR information as mentioned above.
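
When the definitions differ only in a few values, the file can be generated instead of typed. A sketch that reproduces the two definition lines above, varying only the LPAR name and the remote virtual SCSI slot; the function name is illustrative, and every attribute value is copied from the example rather than recommended:

```shell
# Emit one LPAR definition line; $1 = LPAR name, $2 = remote vscsi slot.
lpar_line() {
    printf 'name=%s,profile_name=normal,lpar_env=aixlinux,all_resources=0,' "$1"
    printf 'min_mem=1024,desired_mem=9216,max_mem=9216,proc_mode=shared,'
    printf 'min_proc_units=0.3,desired_proc_units=1.0,max_proc_units=3.0,'
    printf 'min_procs=1,desired_procs=3,max_procs=3,sharing_mode=uncap,uncap_weight=128,'
    printf 'lpar_io_pool_ids=none,max_virtual_slots=10,'
    printf '"virtual_scsi_adapters=6/client/4/vio1a/%s/1,7/client/9/vio2a/%s/1",' "$2" "$2"
    printf '"virtual_eth_adapters=4/0/3//0/1,5/0/4//0/1",'
    printf 'boot_mode=norm,conn_monitoring=1,auto_start=0,power_ctrl_lpar_ids=none,'
    printf 'work_group_id=none,shared_proc_pool_util_auth=1\n'
}

# LPAR1 uses remote slot 11, LPAR2 uses remote slot 12
lpar_line LPAR1 11
lpar_line LPAR2 12
```

Redirecting the output to a file such as /tmp/profiles.txt gives the input for mksyscfg -f.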

To change the settings of your LPAR, use the chsyscfg command as shown below.

Virtual SCSI creation & slot mapping
# chsyscfg -m Server-9117-MMA-SNXXXXX -r prof -i 'name=server_name,lpar_id=xx,"virtual_scsi_adapters=301/client/4/vio01_server/301/0,303/client/4/vio02/303/0,305/client/4/vio01_server/305/0,307/client/4/vio02_server/307/0"'

The command above creates virtual SCSI adapters for the client LPAR and maps their slots to the VIO servers. In this scenario there are two VIO servers for redundancy.

Slot Mapping

Vio01_server (VSCSI server slot)    Client (VSCSI client slot)
Slot 301                            Slot 301
Slot 303                            Slot 303

VIO02_server (VSCSI server slot)    Client (VSCSI client slot)
Slot 305                            Slot 305
Slot 307                            Slot 307

With the slots mapped this way, any disk or logical volume that is mapped to a virtual SCSI server adapter through the VIO command "mkvdev" is presented to the client on the matching client slot.

Syntax for Virtual scsi adapter

virtual-slot-number/client-or-server/remote-lpar-ID/remote-lpar-name/remote-slot-number/is-required

(Some HMC releases document an additional supports-HMC field after client-or-server; the examples here use the six-field form.)

In the chsyscfg command above, "virtual_scsi_adapters=301/client/4/vio01_server/301/0"

means

301 -- virtual-slot-number
client -- client-or-server (this adapter is the client side, on the AIX client)
4 -- remote-lpar-ID (partition ID of the VIO_01 server)
vio01_server -- remote-lpar-name
301 -- remote-slot-number (the virtual SCSI server slot on the VIO server)
0 -- is-required: 0 means desired (the adapter can be removed by DLPAR operations), 1 means required (it cannot be removed by DLPAR operations)
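
The field breakdown can be checked mechanically by splitting a definition on "/". A sketch for the six-field form used in the example (the function name and the labels are illustrative, not HMC output):

```shell
# Split one six-field virtual SCSI adapter definition into labelled parts.
describe_vscsi() {
    echo "$1" | awk -F/ '{
        printf "virtual slot: %s\n", $1
        printf "adapter type: %s\n", $2
        printf "remote LPAR ID: %s\n", $3
        printf "remote LPAR name: %s\n", $4
        printf "remote slot: %s\n", $5
        printf "required: %s\n", ($6 == "1" ? "yes" : "no (desired)")
    }'
}

describe_vscsi 301/client/4/vio01_server/301/0
```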

To add virtual Ethernet adapters and slot mapping for the profile created above

# chsyscfg -m Server-9117-MMA-SNxxxxx -r prof -i 'name=server_name,lpar_id=xx,"virtual_eth_adapters=596/1/596//0/1,506/1/506//0/1"'

Syntax for Virtual ethernet adapter

slot_number/is_ieee/port_vlan_id/"additional_vlan_id,additional_vlan_id"/is_trunk(number=priority)/is_required

means

So the adapter with the setting 596/1/596//0/1 is in slot_number 596, it is IEEE capable, its port_vlan_id is 596, it has no additional VLAN IDs assigned, it is not a trunk adapter, and it is required.
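
The same positional reading can be scripted. A sketch that labels the six fields of one virtual Ethernet definition, following the syntax line above and treating any non-zero trunk value as a priority (the function name and labels are made up for illustration):

```shell
# Label the fields of one virtual Ethernet adapter definition.
describe_veth() {
    echo "$1" | awk -F/ '{
        printf "slot: %s\n", $1
        printf "IEEE capable: %s\n", ($2 == "1" ? "yes" : "no")
        printf "port VLAN ID: %s\n", $3
        printf "additional VLAN IDs: %s\n", ($4 == "" ? "none" : $4)
        printf "trunk: %s\n", ($5 == "0" ? "no" : "yes, priority " $5)
        printf "required: %s\n", ($6 == "1" ? "yes" : "no")
    }'
}

describe_veth 596/1/596//0/1
```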

Listing LPAR information from HMC command line interface

To list the managed systems (CEC) managed by the HMC

# lssyscfg -r sys -F name

To list the LPARs defined on a managed system (CEC)

# lssyscfg -m SYSTEM(CEC) -r lpar -F name,lpar_id,state

To list the profiles of an LPAR on your system, use the lssyscfg command as shown below.

# lssyscfg -r prof -m SYSTEM(CEC) --filter "lpar_ids=X,profile_names=normal"

Flags

m -> Managed system name
lpar_ids -> LPAR ID (numeric ID of each LPAR created in the managed system (CEC))
profile_names -> To choose the profile of the LPAR

To start console of LPAR from HMC

# mkvterm -m SYSTEM(CEC) --id X

m- > managed system (ex -p5-570_xyz)
id - > LPAR ID

To finish a VTERM session, simply press ~ followed by a dot (.).

To disconnect console of LPAR from HMC

# rmvterm -m SYSTEM(CEC) --id x

To access the LPAR consoles of different managed systems from the HMC

# vtmenu

Activating Partition

hscroot@hmc-570:~> lssyscfg -m Server-9110-510-SN100129A -r lpar -F name,lpar_id,state,default_profile
VIOS1.3-FP8.0,1,Running,default
linux_test,2,Not Activated,client_default
hscroot@hmc-570:~> chsysstate -m Server-9110-510-SN100129A -r lpar -o on -b norm --id 2 -f client_default

The above example would boot the partition in normal mode. To boot it into SMS menu use -b sms and to boot it to the OpenFirmware prompt use -b of.

To restart a partition the chsysstate command would look like this:

hscroot@hmc-570:~> chsysstate -m Server-9110-510-SN100129A -r lpar --id 2 -o shutdown --immed --restart

And to turn it off - if everything else fails - use this:
hscroot@hmc-570:~> chsysstate -m Server-9110-510-SN100129A -r lpar --id 2 -o shutdown --immed
hscroot@hmc-570:~> lssyscfg -m Server-9110-510-SN100129A -r lpar -F name,lpar_id,state
VIOS1.3-FP8.0,1,Running
linux_test,2,Shutting Down

Deleting Partition

hscroot@hmc-570:~> lssyscfg -m Server-9110-510-SN100129A -r lpar -F name,lpar_id
VIOS1.3-FP8.0,1
linux_test,2
hscroot@hmc-570:~> rmsyscfg -m Server-9110-510-SN100129A -r lpar --id 2
hscroot@hmc-570:~> lssyscfg -m Server-9110-510-SN100129A -r lpar -F name,lpar_id
VIOS1.3-FP8.0,1

Enabling the Advanced POWER Virtualization Feature

Before we could use the virtual I/O, we had to determine whether the machine was enabled to use the feature. To do this, we right-clicked on the name of the target server in the HMC’s ‘Server and Partition’ view and looked at that server’s properties. Figure 4 shows it did not have the feature enabled.

Users can enable this feature by obtaining a key code from their IBM sales representative using information that the HMC gathers about their machine when the user navigates to Show Code Information in the HMC. Figure 5 shows how to navigate there as well as how to get to the HMC dialog box used to enter the activation code which renders the system VIO-capable. We obtained an access code and entered it in the dialog box in Figure.

VIO server setup example

Virtual I/O Example
A user who currently runs applications on a POWER4 system may want to upgrade to a POWER5 system running AIX 5.3 in order to take advantage of virtual I/O. If so, do these three things:
•Create a Virtual I/O Server.
•Add virtual LANs.
•Define virtual SCSI devices.
In our example, we had an IBM eServer p5 550 Express with four CPUs that was running one AIX 5.3 database server LPAR, and we needed to create a second application server LPAR that uses a virtual SCSI disk as its boot disk. We wanted to share one Ethernet adapter between the database and application server LPARs and use this shared adapter to access an external network. Finally, we needed a private network between the two LPARs and we decided to implement it using virtual Ethernet devices (see Figure 3). We followed these steps to set up our system:

1. Enabled the Advanced POWER Virtualization feature.

2.Installed the Virtual I/

Virtual I/O Server installation & administration

The Virtual I/O Server
The Virtual I/O Server is a dedicated partition that runs a special operating system called IOS. This special type of partition has physical resources assigned to it in its HMC profile. The administrator issues IOS commands on the server partition to create virtual resources, which present virtual LAN, virtual SCSI adapters, and virtual disk drives to client partitions. The client partitions' operating systems recognize these resources as physical devices. The Virtual I/O Server is responsible for managing the interaction between the client LPAR and the physical device supporting the virtualized service. Once the administrator logs in to the Virtual I/O Server as the user padmin, he or she has access to a restricted Korn shell session. The administrator uses IOS commands to create, change, and remove these physical and virtual devices as well as to configure and manage the VIO server. Executing the help command on the VIO server command line lists the commands that are available in padmin's restricted Korn shell session.

Virtual I/O Server installation

VIO Server code is packaged and shipped as an AIX mksysb image on a VIO DVD
Installation methods
– DVD install
– HMC install - Open rshterm and type “installios”; follow the prompts
– Network Installation Manager (NIM)
VIO Server can support multiple client types
– AIX 5.3
– SUSE Linux Enterprise Server 9 or 10 for POWER
– Red Hat Enterprise Linux AS for POWER Version 3 and 4

Virtual I/O Server Administration
The VIO server uses a command line interface running in a restricted shell
– no smitty or GUI
There is no root login on the VIO Server
A special user – padmin – executes VIO server commands
At the first login after install, user padmin is prompted to change the password
After that, padmin runs the command “license -accept”
Slightly modified commands are used for managing devices, networks, code installation and maintenance, etc.
The padmin user can start a root AIX shell for setting up third-party devices using the command “oem_setup_env”

All available commands are listed by running help as the padmin user

$ help
Install Commands         Physical Volume Commands    Security Commands
  updateios                lspv                        lsgcl
  lssw                     migratepv                   cleargcl
  ioslevel                                             lsfailedlogin
  remote_management      Logical Volume Commands
  oem_setup_env            lslv                      UserID Commands
  oem_platform_level       mklv                        mkuser
  license                  extendlv                    rmuser
                           rmlv                        lsuser
LAN Commands               mklvcopy                    passwd
  mktcpip                  rmlvcopy                    chuser
  hostname
  cfglnagg
  netstat                Volume Group Commands       Maintenance Commands
  entstat                  lsvg                        chlang
  cfgnamesrv               mkvg                        diagmenu
  traceroute               chvg                        shutdown
  ping                     extendvg                    fsck
  optimizenet              reducevg                    backupios
  lsnetsvc                 mirrorios                   savevgstruct
                           unmirrorios                 restorevgstruct
Device Commands            activatevg                  starttrace
  mkvdev                   deactivatevg                stoptrace
  lsdev                    importvg                    cattracerpt
  lsmap                    exportvg                    bootlist
  chdev                    syncvg                      snap
  rmdev                                                startsysdump
  cfgdev                                               topas
  mkpath                                               mount
  chpath                                               unmount
  lspath                                               showmount
  rmpath                                               startnetsvc
  errlog                                               stopnetsvc

Virtual I/O Server Overview

What is Advanced POWER Virtualization (APV)
APV – the hardware feature code for POWER5 servers that enables:
– Micro-partitioning – fractional CPU entitlements from a shared pool of processors, beginning at one-tenth of a CPU
– Partition Load Manager (PLM) – a policy-based, dynamic CPU and memory reallocation tool
– Physical disks can be shared as virtual disks to client partitions
– Shared Ethernet Adapter (SEA) – A physical adapter or EtherChannel in a VIO Server can be shared by client partitions. Clients use virtual Ethernet adapters
Virtual Ethernet – an LPAR-to-LPAR Virtual LAN within a POWER5 Server
– Does not require the APV feature code

Why Virtual I/O Server?
POWER5 systems will support more partitions than physical I/O slots available
– Each partition still requires a boot disk and network connection, but now they can be virtual instead of physical
VIO Server allows partitions to share disk and network adapter resources
– The Fibre Channel or SCSI controllers in the VIO Server can be accessed using Virtual SCSI controllers in the clients
– A Shared Ethernet Adapter in the VIO Server can be a layer 2 bridge for virtual Ethernet adapters in the clients
The VIO Server further enables on demand computing and server consolidation

Virtualizing I/O saves:
– Gbit Ethernet Adapters
– 2 Gbit Fibre Channel Adapters
– PCI slots
– Eventually, IO drawers
– Server frames?
– Floor space?
– Electric, HVAC?
– Ethernet switch ports
– Fibre channel switch ports
– Logistics, scheduling, delays of physical Ethernet, SAN attach
Some servers run 90% utilization all the time – everyone knows which ones.
Average utilization in the UNIX server farm is closer to 25%. They don’t all maximize their use of dedicated I/O devices
VIO is a departure from the “new project, new chassis” mindset

Virtual I/O Server Characteristics

Requires AIX 5.3 and POWER5 hardware with APV feature
Installed as a special purpose, AIX-based logical partition
Uses a subset of the AIX Logical Volume Manager and attaches to traditional storage subsystems
Inter-partition communication (client-server model) provided via the POWER Hypervisor
Clients “see” virtual disks as traditional AIX SCSI hdisks, although they may be a physical disk or logical volume on the VIO Server
One physical disk on a VIO server can provide logical volumes for several client partitions

Virtual Ethernet
– Enable inter-lpar communications without a physical adapter
– IEEE-compliant Ethernet programming model
– Implemented through inter-partition, in-memory communication
VLAN splits up groups of network users on a physical network onto segments of logical networks
Virtual switch provides support for multiple (up to 4K) VLANs
– Each partition can connect to multiple networks, through one or more adapters
– VIO server can add VLAN ID tag to the Ethernet frame as appropriate. Ethernet switch restricts frames to ports that are authorized to receive frames with specific VLAN ID
Virtual network can connect to physical network through “routing" partitions – generally not recommended

Why Multiple VIO Servers?
Second VIO Server adds extra protection to client LPARs
Allows two teams to learn VIO setup on a single system
Having Multiple VIO Servers will:
– Provide you multiple paths to your OS/Data virtual disks
– Provide you multiple paths to your network
Advantages:
– Superior availability compared with other virtual I/O solutions
– Allows VIO Server updates without shutting down client LPARs

Saturday, April 26, 2008

Monitoring and Troubleshooting a Cluster

This chapter presents general information for monitoring and troubleshooting an HACMP for Linux configuration.
This chapter contains the following sections:
•Problem Determination Tools
•Viewing Cluster Information (clstat) in WebSMIT
•Useful Commands
•Logging Messages
•Solving Common Problems with Networks and Applications.
Problem Determination Tools
The WebSMIT Problem Determination Tools menu has a set of tools for troubleshooting and recovering from problems that may arise in a cluster environment.
The Problem Determination Tools panel in WebSMIT includes:
•View Current State. WebSMIT displays cluster information using a slightly different layout and organization. Cluster components are displayed along with their status. Expanding an item reveals additional information about it, including networks, interfaces and active resource groups.
•HACMP Log Viewing and Management. Contains utilities that display or manage logs maintained by HACMP. These include the log file named hacmp.out, which keeps a record of all of the local cluster events as performed by the HACMP event scripts. These HACMP event scripts automate many common system administration tasks and, in the event of a failure, manage HACMP and system resources to provide recovery.
•Recover From HACMP Script Failure. Contains a command that HACMP will run to recover from a script failure. This is useful if the Cluster Manager is in reconfiguration due to a failed event script. Use this option after having manually fixed the error condition.
•Restore HACMP Configuration Database from Active Configuration.
Viewing Cluster Information (clstat) in WebSMIT
With HACMP 5.4.1, you can use WebSMIT to:
•Display detailed cluster information
•Navigate and view the status of the running cluster
•Configure and manage the cluster
•View graphical displays of sites, networks, nodes and resource group dependencies.
Useful Commands
You have these additional utilities:
•To view the resource group location and status, use the clRGinfo command.
•To view the service IP label information, run the ifconfig command on the node that currently owns the resource group.
For a list of commands supported in HACMP for Linux, see Command Reference in Appendix A: Command Reference and the clinfo Utility.
Logging Messages
HACMP for Linux uses the standard logging facilities for HACMP. For information about logging in HACMP, see the HACMP for AIX Troubleshooting Guide.
To troubleshoot the HACMP operations in your cluster, use the event summaries in the hacmp.out file and syslog.
The system logs messages into the following files:
•/tmp/clstrmgr.debug
•/tmp/cspoc.log
•/tmp/clappmond
•/tmp/hacmp.out
•/usr/es/adm/cluster.log
•/var/hacmp/clcomd/clcomd.log
•/var/hacmp/clcomd/clcomddiag.log
•/var/hacmp/log/clutils.log
•/usr/es/sbin/cluster/wsm/logs/wsm_smit.*
•/websmit/logs/wsm_smit.*
•/usr/es/sbin/cluster/snapshots/*
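A quick way to see which of these log files are present on a node is to expand each path as a glob pattern. The following is a minimal sketch: the function names `find_logs` and `missing_logs` are hypothetical, and the patterns are copied from the list above without checking them against any particular HACMP release.

```python
import glob

# Paths and patterns taken verbatim from the list above
HACMP_LOG_PATTERNS = [
    "/tmp/clstrmgr.debug",
    "/tmp/cspoc.log",
    "/tmp/clappmond",
    "/tmp/hacmp.out",
    "/usr/es/adm/cluster.log",
    "/var/hacmp/clcomd/clcomd.log",
    "/var/hacmp/clcomd/clcomddiag.log",
    "/var/hacmp/log/clutils.log",
    "/usr/es/sbin/cluster/wsm/logs/wsm_smit.*",
    "/websmit/logs/wsm_smit.*",
    "/usr/es/sbin/cluster/snapshots/*",
]


def find_logs(patterns, globber=glob.glob):
    """Map each pattern to the sorted list of paths it matches (empty if none)."""
    return {pattern: sorted(globber(pattern)) for pattern in patterns}


def missing_logs(found):
    """Return the patterns that matched nothing on this node."""
    return [pattern for pattern, matches in found.items() if not matches]
```

Calling `missing_logs(find_logs(HACMP_LOG_PATTERNS))` on a cluster node lists the logs that have not been created yet; the `globber` parameter exists only so the lookup can be replaced in tests.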
Collecting Cluster Log Files for Problem Reporting
To view the system files and log files as they are collected in an archive file:
1.In WebSMIT, go to the Collect Cluster log files for Problem Reporting menu.
2.Type or select values in entry fields.
3.Use an appropriate Linux tool to extract or view the archive file. The archive file contains the log and system files.
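Step 3 does not name the archive format. Assuming the collected file is a tar archive (optionally compressed), its contents could be listed with Python's standard `tarfile` module. The function name `list_archive` is hypothetical.

```python
import tarfile


def list_archive(path):
    """Return the sorted names of regular files stored in a tar archive.

    Assumption: the archive is a tar file; "r:*" lets tarfile detect
    gzip, bzip2 or xz compression transparently.
    """
    with tarfile.open(path, "r:*") as archive:
        return sorted(
            member.name for member in archive.getmembers() if member.isfile()
        )
```

If the archive turns out to use a different format, `tarfile.open` raises `tarfile.ReadError`, which is a reasonable signal to fall back to another tool.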

Resetting Cluster Tunables

You can reset tunable values that were altered during cluster maintenance to their default, installation-time settings. The installation-time cluster settings are the values that appear in the cluster after installing HACMP from scratch.
Note: Resetting the tunable values does not change any other aspects of the configuration, whereas installing HACMP removes all user-configured configuration information, including nodes, networks, and resources.
To reset the cluster tunable values:
1.Stop the cluster services.
2.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
3.In WebSMIT, select Extended Configuration > Extended Topology Configuration > Configure an HACMP Cluster > Reset Cluster Tunables and press Continue.
Use this option to reset all the tunables (customizations) made to the cluster. For a list of the tunable values that will change, see the section Listing Tunable Values. Using this option returns all tunable values to their default values but does not change the cluster configuration. HACMP takes a snapshot file before resetting. You can choose to have HACMP synchronize the cluster when this operation is complete.
4.Select the options as follows and press Continue:
Synchronize Cluster Configuration
If you set this option to yes, HACMP synchronizes the cluster after resetting the cluster tunables.
5.HACMP asks: “Are you sure?”
6.Press Continue.
HACMP resets all the tunable values to their original settings and removes those that should be removed (such as the nodes’ knowledge about customized pre- and post-event scripts).
Resetting HACMP Tunable Values using the Command Line
We recommend that you use the SMIT interface to reset the cluster tunable values. The clsnapshot -t command also resets the cluster tunables. This command is intended for use by IBM support. See the man page for more information.
Listing Tunable Values
You can change and reset the following list of tunable values:
•User-supplied information.
•Network module tuning parameters, such as failure detection rate, grace period and heartbeat rate. HACMP resets these parameters to their installation-time default values.
•Cluster event customizations, such as all changes to cluster events. Note that resetting changes to cluster events does not remove any files or scripts that the customizations use; it only removes the knowledge HACMP has of pre- and post-event scripts.
•Cluster event rule changes made to the event rules database are reset to the installation-time default values.
•HACMP command customizations made to the default set of HACMP commands are reset to the installation-time defaults.
•Automatically generated and discovered information.
Generally users cannot see this information. HACMP rediscovers or regenerates this information when the cluster services are restarted or during the next cluster synchronization.
HACMP resets the following:
•Local node names stored in the cluster definition database
•Netmasks for all cluster networks
•Netmasks, interface names and aliases for disk heartbeating (if configured) for all cluster interfaces
•SP switch information generated during the latest node_up event (this information is regenerated at the next node_up event)
•Instance numbers and default log sizes for the RSCT subsystem.
Understanding How HACMP Resets Cluster Tunables
HACMP resets tunable values to their default values under the following conditions:
•Before resetting HACMP tunable values, HACMP takes a cluster snapshot. After the values have been reset to defaults, if you want to go back to your customized cluster settings, you can restore them with the cluster snapshot. HACMP saves snapshots of the last ten configurations in the default cluster snapshot directory, /usr/es/sbin/cluster/snapshots, with the name active.x.odm, where x is a digit between 0 and 9, with 0 being the most recent.
•Stop cluster services on all nodes before resetting tunable values. HACMP prevents you from resetting tunable values in a running cluster.
In some cases, HACMP cannot differentiate between user-configured information and discovered information, and does not reset such values. For example, you may enter a service label and HACMP automatically discovers the IP address that corresponds to that label. In this case, HACMP does not reset the service label or the IP address. The cluster verification utility detects if these values do not match.
The clsnapshot.log file in the snapshot directory contains log messages for this utility. If any of the following scenarios are run, then HACMP cannot revert to the previous configuration:
•cl_convert is run automatically
•cl_convert is run manually
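The snapshot naming rule described above (active.x.odm, where x is a digit from 0 to 9 and 0 is the most recent) can be sketched as a small sorting helper. The function name `order_snapshots` is hypothetical; only the file-name pattern comes from the text.

```python
import re

# active.x.odm, where x is a single digit and 0 is the most recent snapshot
SNAPSHOT_RE = re.compile(r"^active\.(\d)\.odm$")


def order_snapshots(filenames):
    """Return the automatic snapshot names, most recent first."""
    matched = []
    for name in filenames:
        match = SNAPSHOT_RE.match(name)
        if match:
            matched.append((int(match.group(1)), name))
    # Lower digit means more recent, so ascending order is newest first
    return [name for _, name in sorted(matched)]
```

Passing it the output of `os.listdir("/usr/es/sbin/cluster/snapshots")` would give the restore candidates in order, ignoring clsnapshot.log and any user-named snapshots.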

System Management (C-SPOC) Tasks

Use the System Management (C-SPOC) panel in WebSMIT to configure from one node the resources that are shared among nodes. The System Management utility of HACMP lets you administer many aspects of the cluster and its components from one Cluster Single Point of Control (C-SPOC). By automating repetitive tasks, C-SPOC eliminates a potential source of errors and speeds up the cluster maintenance process.
In WebSMIT, you access C-SPOC using the System Management (C-SPOC) menu.
In this panel, you can do the following tasks from one node:
•Manage HACMP services, or start and stop cluster services: Cluster Manager (clstrmgr) and Cluster Information (clinfo).
•HACMP Communication Interface Management. Manage the communication interfaces of existing cluster nodes using C-SPOC.
•HACMP Resource Group and Application Management. Provides menus to manage cluster resource groups and analyze cluster applications.
•HACMP Log Viewing and Management. Manage, view, and collect HACMP log files and event summaries.
Starting HACMP Cluster Services
To start HACMP cluster services:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > Manage HACMP Services > Start HACMP Services and press Continue.
For detailed instructions, see the HACMP on AIX Administration Guide.
Stopping HACMP Cluster Services
To stop HACMP cluster services:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > Manage HACMP Services > Stop HACMP Services and press Continue.
For detailed instructions, see the HACMP on AIX Administration Guide.
Managing Resource Groups and Applications
To manage resource groups and applications:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > HACMP Resource Group and Application Management and press Continue.
Viewing and Managing Logs
To view and manage logs:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select System Management (C-SPOC) > HACMP Log Viewing and Management and press Continue.
For detailed instructions, see the HACMP on AIX Administration Guide.

Viewing the Cluster Status

HACMP has a cluster status utility, /usr/es/sbin/cluster/clstat. It reports the status of key cluster components: the cluster itself, the nodes in the cluster, the network interfaces connected to the nodes, and the resource groups on each node.
clstat is available in WebSMIT at the left side of the top-level menu. It displays an expandable list of cluster components along with their status. The cluster status display window shows information and status (up or down, online, offline or error) on cluster nodes, networks, interfaces, application servers and resource groups. For resource groups, it also shows the node on which the group is currently hosted.
Here is an example of the clstat output in WebSMIT. This is the left-hand side panel of the window:

Figure 2. clstat Output
Here is an example of the ASCII-based output from the clstat command, run on a Linux cluster with four nodes named ppstest1 through ppstest4:
ppstest2:~ # /usr/es/sbin/cluster/clstat
clstat - HACMP Cluster Status Monitor
-------------------------------------
Cluster: test1234 (1148058900)
Wed May 17 16:45:41 2006
State: UP Nodes: 4
SubState: STABLE
Node: ppstest1 State: UP
Interface: tr0 (6) Address: 9.57.28.3
State: UP
Resource Group: rg1 State: On line
Node: ppstest2 State: UP
Interface: tr0 (6) Address: 9.57.28.4
State: UP
Resource Group: rg2 State: On line
Node: ppstest3 State: UP
Interface: tr0 (6) Address: 9.57.28.5
State: UP
Node: ppstest4 State: UP
Interface: tr0 (6) Address: 9.57.28.6
State: UP
Resource Group: rg3 State: On line
Resource Group: rg4 State: On line
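For scripting, output like the above can be reduced to a per-node summary. The sketch below assumes the line layout shown in this example (a "Node:" line followed by its interfaces and resource groups); the real clstat layout may differ between releases, and the function name `parse_clstat` is hypothetical.

```python
import re

NODE_RE = re.compile(r"^Node:\s+(\S+)\s+State:\s+(\S+)")
RG_RE = re.compile(r"^Resource Group:\s+(\S+)\s+State:\s+(.+)$")


def parse_clstat(text):
    """Return {node: {"state": ..., "resource_groups": {name: state}}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        node = NODE_RE.match(line)
        if node:
            current = node.group(1)
            nodes[current] = {"state": node.group(2), "resource_groups": {}}
            continue
        group = RG_RE.match(line)
        # A resource group belongs to the most recent Node: line above it
        if group and current is not None:
            nodes[current]["resource_groups"][group.group(1)] = group.group(2).strip()
    return nodes
```

Applied to the transcript above, this would report rg1 on ppstest1, rg2 on ppstest2, nothing on ppstest3, and rg3 and rg4 on ppstest4.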

Configuring HACMP Application Servers

To configure an application server on any cluster node:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Servers > Add an Application Server and press Continue.
WebSMIT displays the Add an Application Server panel.
3.Enter field values as follows:
Server Name
Enter an ASCII text string that identifies the server. You will use this name to refer to the application server when you define resources during node configuration. The server name can include alphabetic and numeric characters and underscores. Use no more than 64 characters.
Start Script
Enter the pathname of the script (followed by arguments) called by the cluster event scripts to start the application server. (Maximum 256 characters.) This script must be in the same location on each cluster node that might start the server. The contents of the script, however, may differ.
Stop Script
Enter the pathname of the script called by the cluster event scripts to stop the server. (Maximum 256 characters.) This script must be in the same location on each cluster node that may start the server. The contents of the script, however, may differ.
4.Press Continue to add this information to the HACMP Configuration Database on the local node.
5.Add the application start, stop and notification scripts to every node in the cluster.
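The field limits given in step 3 can be checked before the values are typed into WebSMIT. This is a minimal sketch of those stated rules only (server name of letters, digits and underscores up to 64 characters; script entries up to 256 characters); the function name `check_app_server` is hypothetical, and HACMP may apply further checks of its own.

```python
import re

# Letters, digits and underscores, 1 to 64 characters (per the Server Name field)
SERVER_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,64}")
MAX_SCRIPT_LEN = 256  # per the Start Script and Stop Script fields


def check_app_server(name, start_script, stop_script):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not SERVER_NAME_RE.fullmatch(name):
        problems.append(f"invalid server name: {name!r}")
    for label, value in (("start script", start_script), ("stop script", stop_script)):
        if not value:
            problems.append(f"{label} is empty")
        elif len(value) > MAX_SCRIPT_LEN:
            problems.append(f"{label} exceeds {MAX_SCRIPT_LEN} characters")
    return problems
```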
Verifying Application Servers
Make sure that the application start, stop and notification scripts exist and are executable on every node in the cluster. Use the cllsserv command.
For example:
ppstest2:~ # /usr/es/sbin/cluster/utilities/cllsserv
app_test2_primary /usr/local/app_start /usr/local/app_stop
ppstest2:~ # ls -l /usr/local/app_start
-rwxr--r-- 1 root root 169 May 10 22:54 /usr/local/app_start
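The same check can be scripted for every configured server. The sketch below assumes the three-column cllsserv output shown in the example (name, start script, stop script, with no script arguments); the function names are hypothetical. It would need to be run on each node, since the scripts must exist everywhere.

```python
import os


def parse_cllsserv(output):
    """Map server name -> (start_script, stop_script) from cllsserv-style output.

    Assumption: whitespace-separated columns as in the example above,
    with no arguments after the script paths.
    """
    servers = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            servers[fields[0]] = (fields[1], fields[2])
    return servers


def non_executable(paths):
    """Return the paths that are missing or not executable on this node."""
    return [
        path
        for path in paths
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]
```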
Configuring Application Monitors
Once you have configured application servers, HACMP for Linux lets you define application monitors that check the health of the running application process or check for the successful start of the application.
To configure application monitors:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Applications > Configure HACMP Application Monitoring and press Continue. A selector screen appears for Configure Process Application Monitoring and Configure Custom Application Monitoring.
3.Select the type of monitoring you want and press Continue.
4.Select the application server to which you want to add a monitor.
5.Fill in the field values and press Continue.
For additional reference information on application monitoring, its types, modes, and other information, see the HACMP for AIX Administration Guide.
Including Resources into Resource Groups
Once you have configured resources in HACMP, you include them in resource groups so that HACMP can manage them as a single set. For example, if an application depends on a service IP label, you can add both to a single resource group.
HACMP manages the resources in a resource group by bringing the resource groups online and offline on their home node(s), or moving them to other nodes, if necessary for recovery.
Note: For detailed instructions on resource groups, see the HACMP for AIX Administration Guide. This guide contains descriptions of procedures in HACMP SMIT, and the options are identical to those used in WebSMIT in HACMP for Linux.
Resource Group Management: Overview
In the Extended Configuration > Extended Resource Configuration > HACMP Extended Resource Group Configuration WebSMIT screen, you can:
•Add a resource group.
•Change/Show a resource group. The system displays all previously defined resource groups. After selecting a particular resource group, you can view and change the group name, node relationship, and participating nodes (nodelist). You can also change the group’s startup, fallover and fallback policies.
•Remove a resource group.
•Change/Show resources for a resource group. Add resources, such as a service IP label for the application, or an application server, to a resource group. HACMP always activates and brings offline these resources on a particular node as a single set. If you want HACMP to activate one set of resources on one node and another set of resources on another node, create separate resource groups for each set.
•Show all resources by node for a resource group.
HACMP for Linux does not allow you to change resources dynamically, that is, while HACMP cluster services are running on the nodes. To change previously added resources, stop the cluster services first.
Adding Resources to a Resource Group
To include resources into a resource group:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Change/Show All Resources and Attributes for a Resource Group and press Continue.
3.Fill in the field values and press Continue. HACMP adds the resources.
For additional information on adding or changing resources in resource groups, and for information on other resource group management tasks, see the Administration Guide.
Synchronizing the HACMP Cluster Configuration
We recommend that you do all the configuration from one node and synchronize the cluster to propagate this information to other nodes.
Use this WebSMIT option to commit and distribute your changes automatically to all of the specified nodes.
To synchronize an HACMP cluster configuration:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Verification and Synchronization and press Continue.
If you configured the cluster correctly, HACMP synchronizes the configuration. HACMP issues errors if the configuration is not valid.
Displaying the HACMP Cluster Configuration
You can ask HACMP to show you the status of different configured components. The WebSMIT options for displaying different cluster entities are grouped together with the options for adding them to the cluster.
Here are some examples of the options you have:
•Show HACMP Topology by node, by network name, or by communication interface
•Change/Show Persistent IP Labels
•Show Cluster Applications and change/show application monitors per application
•Change/Show Service IP Labels
•Show all Resources by Node or Resource Groups
•View cluster logs (In WebSMIT, it is under System Management > Log Viewing and Management)
•Show Cluster Services (whether running or not).

Configuring Service IP Labels

To add service IP labels/addresses as resources to the resource group in your cluster:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Resource Configuration > HACMP Extended Resources Configuration > Configure HACMP Service IP Labels/Addresses > Add a Service IP Label/Address and press Continue.
3.Fill in field values as follows:
IP Label/Address
Enter, or select from the picklist, the IP label/address to be kept highly available.
Network Name
Enter the symbolic name of the HACMP network on which this Service IP label/address will be configured.
4.Press Continue after filling in all required fields.
5.Repeat the previous steps until you have configured all service IP labels/addresses for each network, as needed.
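Elsewhere this guide asks that all necessary entries be present in /etc/hosts on each cluster node. A small helper can confirm that every service IP label you plan to configure appears there. This is a sketch: the function names are hypothetical, the parser handles only the usual "address name [aliases] # comment" layout, and it inspects the file text rather than performing a real name lookup.

```python
def parse_hosts(text):
    """Map each host name or alias in hosts-file text to its address."""
    table = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first address seen for a name, as a resolver would
            table.setdefault(name, address)
    return table


def unresolved_labels(labels, hosts_text):
    """Return the labels that have no entry in the hosts-file text."""
    table = parse_hosts(hosts_text)
    return [label for label in labels if label not in table]
```

For example, `unresolved_labels(["app_svc"], open("/etc/hosts").read())` returns an empty list when the label is defined on that node.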

WebSMIT Tasks Overview

The main WebSMIT menu in HACMP for Linux contains the following menu items and tabs:
•Extended Configuration to configure your cluster.
•System Management (C-SPOC). C-SPOC (Cluster Single Point of Control) is an HACMP function that lets you run HACMP cluster-wide configuration commands from one node in the cluster. In HACMP for Linux, you can use System Management (C-SPOC) to start and stop the cluster services and to move, bring online and bring offline resource groups.
•Problem Determination Tools. You can customize cluster verification, view current cluster state, view logs, recover from a cluster event failure, configure error notification methods and perform other troubleshooting tasks.
•HACMP Documentation. This is the top-level tab that contains a page with links to all online and printable versions of HACMP documentation, including this guide.
Here is the top-level HACMP for Linux WebSMIT menu:

Tasks for Configuring a Basic Cluster

You configure an HACMP for Linux cluster using the Extended Configuration path in WebSMIT.
Note: In general, the sections in this guide provide a high-level overview of user interface options. See the HACMP for AIX Administration Guide for detailed procedures, field help, and recommendations for configuring each and every HACMP component.
To configure a basic cluster:
1.On one cluster node, configure a cluster name and add cluster nodes. See:
•Defining a Cluster Name
•Adding Nodes and Establishing Communication Paths
2.On each cluster node, configure all supporting networks and interfaces: serial networks for heartbeating and IP-based cluster networks for cluster communication.
Also, configure communication devices (that you must have previously defined to the operating system) to HACMP. Configure boot network interfaces (that you must have previously defined to the operating system) to HACMP. Also, configure persistent IP labels for cluster administration purposes. See:
•Configuring Serial Networks for Heartbeating
•Adding IP-Based Networks
•Configuring Communication Interfaces/Devices to HACMP
•Adding Persistent IP Labels for Cluster Administration Purposes
3.On one cluster node, configure cluster resources that will be associated with the application: service IP labels, application servers and application monitors. See:
•Configuring Resources to Make Highly Available
•Configuring Service IP Labels
•Configuring Application Servers
•Configuring Application Monitors
4.On one cluster node, include resources into resource groups. See Including Resources into Resource Groups.
5.Synchronize the cluster configuration. See Synchronizing the HACMP Cluster Configuration.
6.View the HACMP cluster configuration. See Displaying the HACMP Cluster Configuration.
7.Start the HACMP for Linux cluster services on the cluster nodes. When you do so, HACMP will activate the resource group with the application, and will start monitoring it for high availability. See Starting HACMP Cluster Services.
Defining a Cluster Name
Before starting to configure a cluster:
•Make sure that you added all necessary entries to the /etc/hosts file on each machine that will serve as a cluster node. See Planning IP Networks and Network Interfaces.
•Make sure that WebSMIT is installed and can be started on one of the nodes. See Installing and Configuring WebSMIT.
•Log in to WebSMIT. See Starting WebSMIT.
The only step necessary to configure a cluster is to assign the cluster name. When you assign a name to your cluster in WebSMIT, HACMP associates this name with the HACMP-assigned cluster ID.
To assign a cluster name and configure a cluster:
1.Log in to a URL where WebSMIT is installed. The browser window displays the top-level WebSMIT screen.
2.In WebSMIT, select Extended Configuration > Extended Topology Configuration > Configure an HACMP Cluster > Add/Change/Show an HACMP Cluster and press Continue.
3.Enter field values as follows:
Cluster Name
Enter an ASCII text string that identifies the cluster. The cluster name can include alphanumeric characters and underscores, but cannot have a leading numeric. Use no more than 32 characters. Do not use reserved names. For a list of reserved names see List of Reserved Words.
4.Press Continue. If you are changing an existing cluster name, restart HACMP for changes to take effect.
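The cluster name rules in step 3 translate directly into a validation check. The sketch below covers only the rules stated there (alphanumeric characters and underscores, no leading digit, at most 32 characters); the function name is hypothetical, and because the list of reserved words is not reproduced here, it is passed in as a parameter.

```python
import re

# First character: letter or underscore; up to 31 further letters, digits or underscores
CLUSTER_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_cluster_name(name, reserved=frozenset()):
    """Check a proposed cluster name against the rules stated above.

    `reserved` should hold the names from the List of Reserved Words,
    which is not included in this sketch.
    """
    return bool(CLUSTER_NAME_RE.fullmatch(name)) and name not in reserved
```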
