Chapter 3. System Operation

This chapter describes how to use the SMC for Altix ICE systems management software to operate your Altix ICE system and covers the following topics:

Software Image Management

This section describes image management operations.

This section describes the Linux services turned off on compute nodes by default, how to customize the software running on compute nodes or service nodes, how to create a simple clone image of compute node or service node software, how to use the cimage command, how to use the crepo command to manage software image repositories, and how to use the cinstallman command to create compute and service node images. It covers these topics:

Compute Node Services Turned Off by Default

To improve the performance of applications running MPI jobs on compute nodes, most services are disabled by default in compute node images. To see what adjustments are being made, view the /etc/opt/sgi/conf.d/80-compute-distro-services script.

If you wish to change anything in this script, SGI suggests that you copy the existing script to .local and adjust it there. Perform the following commands:

# cd /var/lib/systemimager/images/compute-image-name
# cp etc/opt/sgi/conf.d/80-compute-distro-services etc/opt/sgi/conf.d/80-compute-distro-services.local
# vi etc/opt/sgi/conf.d/80-compute-distro-services.local

At this point, the configuration framework will execute the .local version, and skip the other. For more information on making adjustments to configuration framework files, see “SGI Altix ICE System Configuration Framework”.

Use the cimage command to push the changed image out to the leader nodes.
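For example, assuming the image is named compute-image-name, as in the steps above, and that you want to update every rack, a command of the following form pushes the change:

# cimage --push-rack compute-image-name r\*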

crepo Command

You can use the crepo command to manage software repositories, such as SGI Foundation, SMC for Altix ICE, SGI Performance Suite, and the Linux distribution(s) you are using on your system. You also use the crepo command to manage any custom repositories you create yourself.

The configure-cluster command calls the crepo command when it prompts you for media and then makes that media available. You can also use the crepo command to add additional media later.

Each repository has an associated name, directory, update URL, selection status, and suggested package list. The update URL is used by the sync-repo-updates command. The directory is where the actual yum repository resides; it is located in one of the following places:

Repository            Description
/tftpboot/sgi/*       For SGI media
/tftpboot/other/*     For any media that is not from SGI
/tftpboot/distro/*    For Linux distribution repositories such as SLES or RHEL
/tftpboot/x           Customer-supplied repositories

The repository information is determined from the media itself when you add media supplied by SGI, Linux distribution media (SLES, RHEL, and so on), or any other YaST-compatible media. For customer-supplied repositories, you must provide the information to the crepo command when adding the repository.

Repositories can be selected and unselected. Usually, SMC for Altix ICE commands ignore unselected repositories. One notable exception is that sync-repo-updates always operates on all repositories.

The crepo command constructs default RPM lists based on the suggested package lists. The RPM lists can be used by the cinstallman command when creating a new image. These RPM lists are only generated if a single distribution is selected; they reside in /etc/opt/sgi/rpmlists and match the form generated-*.rpmlist. The crepo command tells you when it updates or removes generated rpmlists. For example:

# crepo --select SUSE-Linux-Enterprise-Server-10-SP3
Updating: /etc/opt/sgi/rpmlists/generated-compute-sles10sp3.rpmlist
Updating: /etc/opt/sgi/rpmlists/generated-service-sles10sp3.rpmlist

When generating the RPM lists, the crepo command combines the list of distribution RPMs with suggested RPMs from every other selected repository. The distribution RPM lists are usually read from the /opt/sgi/share/rpmlists/distro directory. For example, the compute node RPM list for sles11sp1 is /opt/sgi/share/rpmlists/distro/compute-distro-sles11sp1.rpmlist. The suggested RPMs for non-distribution repositories are read from the /var/opt/sgi/sgi-repodata directory. For example, the rpmlist for SLES 11 SP1 compute nodes is read from /var/opt/sgi/sgi-repodata/SMC-for-ICE-1.0-for-Linux-sles11/smc-ice-compute.rpmlist.

The suggested rpmlists can be overridden by creating an override rpmlist in the /etc/opt/sgi/rpmlists/override/ directory. For example, to change the default SMC for ICE 1.0 suggested rpmlist, create the file /etc/opt/sgi/rpmlists/override/SMC-for-ICE-1.0-for-Linux-sles11/smc-ice-compute.rpmlist.
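For example, assuming the SLES 11 SP1 paths shown above, you could seed an override list with a copy of the shipped suggestion and then edit the copy:

# mkdir -p /etc/opt/sgi/rpmlists/override/SMC-for-ICE-1.0-for-Linux-sles11
# cp /var/opt/sgi/sgi-repodata/SMC-for-ICE-1.0-for-Linux-sles11/smc-ice-compute.rpmlist \
     /etc/opt/sgi/rpmlists/override/SMC-for-ICE-1.0-for-Linux-sles11/
# vi /etc/opt/sgi/rpmlists/override/SMC-for-ICE-1.0-for-Linux-sles11/smc-ice-compute.rpmlist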

The following example shows the contents of the /opt/sgi/share/rpmlists/distro directory, which holds the distribution RPM lists. Change directory (cd) to /opt/sgi/share/rpmlists/distro and use the ls command to list the rpmlists, as follows:

admin distro]# ls
compute-distro-centos5.4.rpmlist  lead-distro-sles11sp1.rpmlist
compute-distro-rhel5.4.rpmlist    service-distro-rhel5.4.rpmlist
compute-distro-rhel5.5.rpmlist    service-distro-rhel5.5.rpmlist
compute-distro-rhel6.0.rpmlist    service-distro-rhel6.0.rpmlist
compute-distro-sles10sp3.rpmlist  service-distro-sles10sp3.rpmlist
compute-distro-sles11sp1.rpmlist  service-distro-sles11sp1.rpmlist
lead-distro-rhel6.0.rpmlist

Specifically, SMC for Altix ICE software looks for /etc/opt/sgi/rpmlists/generated-*.rpmlist and creates an image for each rpmlist that matches.

It also determines the default image to use for each node type by hard-coding "$nodeType-$distro" as the image name, where distro is the admin node's distro and nodeType is compute, service, leader, and so on. The default image can be overridden by specifying a global cattr attribute named image_default_$nodeType, for example, image_default_service. Use cattr -h for information about the cattr command.
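For example, to make a hypothetical image named my-service-node-image the default for service nodes, a command of the following form could be used (check cattr -h for the exact syntax supported by your release):

# cattr set image_default_service my-service-node-image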

The following example shows the contents of the /etc/opt/sgi/rpmlists directory after the crepo command has created the suggested RPM lists. The files with generated- in the name are the RPM lists that the crepo command creates and maintains.

Change directory (cd) to /etc/opt/sgi/rpmlists. Use the ls command to see a list of rpms, as follows:

admin:/etc/opt/sgi/rpmlists # ls
compute-minimal-sles11sp1.rpmlist  generated-lead-rhel6.0.rpmlist
generated-compute-rhel6.0.rpmlist  generated-service-rhel6.0.rpmlist

For more information on rpmlist customization, see “Creating Compute and Service Node Images Using the cinstallman Command”.

For a crepo command usage statement, perform the following:

admin:~ # crepo --help
crepo Usage:
Operations:
--help                : print this usage message

--add {path/URL}      : add SGI/SMC media to the system repositories
       --custom {name}: Optional. Use with --add to add a custom repo under
                         /tftpboot. Repo must pre-exist for this case.

--del {product}       : delete an add-on product and associated /tftpboot repo

--select {product}    : mark the product as selected

--show                : show available add-on products

--show-distro         : like show, but only reports distro media like sles10sp2

--show-updateurls     : Show the update sources associated with add-on products

--reexport            : re-export all repositories with yume.  Use if there
                        was a yume export problem previously.

--unselect {product}  : mark the product as not selected


Flags:

Note for --add: If the pathname is local to the machine, it can be an
ISO file or mounted media.  If a network path is used -- such as an nfs
path or a URL -- the path must point to an ISO file.  The argument to
--add may be a comma delimited list to specify multiple source media.

Use --add for SGI/SMC media, to make the repos and rpms available. If the
supplied SGI/SMC media has suggested rpms from SMC node types, those
suggested rpms will be integrated with the default rpmlists for leader,
service, and compute nodes.  You can use create-default-sgi-images to
re-create the default images including new suggested packages or you can
just browse the updated versions in /etc/opt/sgi/rpmlists.

Use --add with --custom to register your own custom repository.  This will
ensure that, by default, the custom repository is available to yume and
mksiimage commands.  It is assumed you will maintain your own default package
lists, perhaps using the sgi default package lists in /etc/opt/sgi/rpmlists
or /opt/sgi/share/rpmlists as a starting point.  The directory and rpms within
must pre-exist.  This script will create the yum metadata for it.
Example:
crepo --add /tftpboot/myrepo --custom my-custom-name

cinstallman Command

The cinstallman command is a wrapper tool for several SMC for Altix ICE operations that previously ran separately. You can use the cinstallman command to perform the following:

  • Create an image from scratch

  • Clone an existing image

  • Recreate an image (so that any nodes associated with said image prior to the command are also associated after)

  • Use existing images that may have been created by some other means

  • Delete images

  • Show available images

  • Update or manage images (via yume)

  • Update or manage nodes (via yume)

  • Assign images to nodes

  • Choose what a node should do next time it reboots (image itself or boot from its disk)

  • Refresh the bittorrent tarball and torrent file for a compute node image after making changes to the expanded image

When compute images are created for the first time, a bittorrent tarball is also created. When images are pushed to rack leaders for the first time, bittorrent is used to transport the tarball snapshot of the image. However, as you make adjustments to your compute image, those changes do not automatically generate a new bittorrent tarball. We handle that situation by always doing a follow-up rsync of the compute image after transporting the tarball. However, as your compute image diverges from the bittorrent tarball snapshot, transporting that image to a rack leader that does not yet have it becomes less and less efficient.
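For example, after changing an expanded compute image, you can regenerate its bittorrent tarball and torrent file before the next push (the image name compute-sles11 is only an example):

# cinstallman --refresh-bt --image compute-sles11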

You no longer need to use yum, yume , or mksiimage commands directly for most common operations. Compute images are automatically configured in such a way as to make them available to the cimage command.

For a cinstallman command usage statement, perform the following:

admin:~ # cinstallman --help
cinstallman Usage:

cinstallman is a tool that manages:
 - image creation (as a wrapper to mksiimage)
 - node package updates (as a wrapper to yume)
 - image package updates (yume within a chroot to the image)

This is a convenience tool and not all operations for the commands that are
wrapped are provided.  The most common operations are collected here for
ease of use.

For operations that take the --node parameter, the node can be an aggregation
of nodes like cimage and cpower can take.  Depending on the situation,
non-managed or offline nodes are skipped.

The tool retrieves the registered repositories from crepo so that they
need not be specified on the command line.

Operations:
--help                  : print this usage message
--create-image          : create a new systemimager image
                          By default, requires --rpmlist and --image
                          Optional flags  below:
       --clone          : Clone existing image, requires --source, --image.
                          Doesn't require --rpmlist.
       --recreate       : Like --del-image then --add-image, but preserves any
                          node associations.
                          Requires --image and --rpmlist
       --repos {list}   : A comma-separated list of repositories to use.
       --use-existing   : register an already existing image, doesn't
                          require --rpmlist
       --image {image}  : Specify the image to operate on
       --rpmlist {path} : Provide the rpmlist to use when creating images
       --source {image} : Specify a source image to operate on (for clone)

--del-image             : delete the image, may use with --del-nodes
       --image {image}  : Specify the image to operate on

--show-images           : List available images (similar to mksiimage -L)

--show-nodes            : Show non-compute nodes (similar to mksimachine -L)

--update-image          : update packages in image to latest packages available
                          in repos, Requires --image
       --image {image}  : Specify the image to operate on

--refresh-image         : Refresh the given image to include all packages
                          in the supplied rpmlist.  Use after registering
                          new media with crepo that has new suggested rpms.
       --image {image}  : Specify the image to operate on
       --rpmlist {path} : rpmlist containing packages to be sure are included

--yum-image             : Perform yum operations to supplied image, via yume
                          Requires --image, trailing arguments passed to yume
       --image {image}  : Specify the image to operate on

--update-node           : Update supplied node to latest pkgs avail in
                          repos, requires --node
       --node {node}    : Specify the node or nodes to operate on

--refresh-node          : Refresh the given node to include all packages
                          in the supplied rpmlist.  Use after registering
                          new media with crepo that has new suggested rpms.
       --node {node}    : Specify the node or nodes to operate on
       --rpmlist {path} : rpmlist containing packages to be sure are included

--yum-node              : Perform yum operations to nodes, via yume.  Requires
                          --node.  Trailing arguments passed to yume
       --node {node}    : Specify the node or nodes to operate on

--assign-image          : Assign image to node.  Requires --node, --image
       --node {node}    : Specify the node or nodes to operate on
       --image {image}  : Specify the image to operate on

--next-boot {image|disk}: node action next boot: boot from disk or
                          reinstall/reimage?  Requires --node

--refresh-bt            : Refresh the bittorrent tarball and torrent file
                          Requires --image
       --image {image}  : Specify the image to operate on

In the following example, the --refresh-node operation is used to ensure the online managed service nodes include all the packages in the list. You could use this if you updated your rpmlist to include new packages or if you recently added new media with the crepo command and want running nodes to have the newly updated packages. A similar --refresh-image operation exists for images.

# cinstallman --refresh-node --node service\* --rpmlist
/etc/opt/sgi/rpmlists/service-sles11.rpmlist

Customizing Software On Your SGI Altix ICE System

This section discusses how to manage the various nodes on your SGI Altix ICE system. It describes how to configure the nodes, including the compute and service nodes, and how to augment software packages. Many package management tasks can be accomplished by more than one valid method.

For information on installing patches and updates, see “Installing SMC for Altix ICE Patches and Updating SGI Altix ICE Systems ” in Chapter 2.

Creating Compute Node Custom Images

You can add per-host compute node customization to the compute node images. You do this by adding scripts either to the /opt/sgi/share/per-host-customization/global/ directory or the /opt/sgi/share/per-host-customization/mynewimage/ directory on the system admin controller.


Note: When creating custom images for compute nodes, make sure you clone the original SGI images. This keeps the original images intact so that you can fall back to them if necessary. The following example is based on SLES.


Scripts in the global directory apply to all compute node images. Scripts under the image name apply only to the image in question. The scripts are cycled through once per host when the image is installed on the rack leader controllers. They receive one input argument, which is the full path (on the rack leader controller) to the per-host base directory, for example, /var/lib/sgi/per-host/mynewimage/i2n11. There is a README file at /opt/sgi/share/per-host-customization/README on the system admin controller, as follows:

This directory contains compute node image customization scripts which are
executed as part of the install-image operations on the leader nodes when
pulling over a new compute node image.

After the image has been pulled over, and the per-host-customization dir has
been rsynced, the per-host /etc and /var directories are populated, then the
scripts in this directory are cycled through once per-host.  This allows the
scripts to source the node specific network and cluster management settings,
and set node specific settings.

Scripts in the global directory are iterated through first, then if a
directory exists that matches the image name, those scripts are iterated
through next.

You can use the scripts in the global directory as examples.

An example global script, /opt/sgi/share/per-host-customization/global/sgi-fstab, follows:

#!/bin/sh
#
# Copyright (c) 2007,2008 Silicon Graphics, Inc.
# All rights reserved.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
#
# Set up the compute node's /etc/fstab file.
#
# Modify per your site's requirements.
#
# This script is executed once per-host as part of the install-image operation
# run on the leader nodes, which is called from cimage on the admin node.
# The full path to the per-host iru+slot directory is passed in as $1,
# e.g. /var/lib/sgi/per-host/<imagename>/i2n11.
#

# sanity checks
. /opt/sgi/share/per-host-customization/global/sanity.sh

iruslot=$1
os=( $(/opt/oscar/scripts/distro-query -i ${iruslot} | sed -n '/^compat /s/^compat.*: //p') )
compatdistro=${os[0]}${os[1]}

if [ ${compatdistro} = "sles10" -o ${compatdistro} = "sles11" ]; then

	#
	# SLES 10 / SLES 11 compatible
	#
	cat <<EOF >${iruslot}/etc/fstab
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
tmpfs           /tmp            tmpfs   size=150m       0       0
EOF

elif [ ${compatdistro} = "rhel5"  ]; then

	#
	# RHEL 5 compatible
	#

	#
	# RHEL expects several subsys directories to be present under /var/run
	# and /var/lock, hence no tmpfs mounts for them
	#
	cat <<EOF >${iruslot}/etc/fstab
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
tmpfs           /tmp            tmpfs   size=150m       0       0
devpts          /dev/pts        devpts  gid=5,mode=620  0       0
EOF

else

	echo -e "\t$(basename ${0}): Unhandled OS.  Doing nothing"

fi

Modify Compute Image Kernel Boot Options

You can use the cattr command to set extra kernel boot parameters for compute nodes on a per-image basis. For example, to append "cgroup_disable=memory" to the kernel boot parameters for any node booting the "compute-sles11sp1" image, perform a command similar to the following:

% cattr set kernel_extra_params-compute-sles11sp1 cgroup_disable=memory

Push the image, as follows:
# cimage --push-rack mynewimage r1

Compute Node Per-Host Customization for Additional Network Interfaces


Note: The following example is only for systems running SLES.


Per-compute-node customization may be useful for configuring additional network interfaces that are present on some, but not all, compute nodes. An example of how to configure network interfaces on individual compute nodes is the /opt/sgi/share/per-host-customization/mynewimage/mycustomization script, which follows:
#!/usr/bin/perl
# Copyright (c) 2008 Silicon Graphics, Inc.
# All rights reserved.
#
# do node specific setup
#
# This script is executed once per-host as part of the install-image operation
# run on the leader nodes, which is called from cimage on the admin node.
# The full path to the per-host iru+slot directory is passed in as $ARGV[0],
# e.g. /var/lib/sgi/per-host/<imagename>/i2n11.
#

use lib "/usr/lib/systemconfig","/opt/sgi/share/per-host-customization/global";
use sanity;

sanity_checks();

$blade_path = $node = $ARGV[0];
$node =~ s/.*\///;

sub i0n4 {
        my $ifcfg="etc/sysconfig/network/ifcfg-eth2";
        open(IFCFG, ">$blade_path/$ifcfg") or
                die "$0: can't open $blade_path/$ifcfg";
        print IFCFG<<EOF
BOOTPROTO='static'
IPADDR='10.20.0.1'
NETMASK='255.255.0.0'
STARTMODE='onboot'
WIRELESS='no'
EOF
        ;
        close(IFCFG);
}

@nodes = ("i0n4");

foreach $n (@nodes) {
        if ( $n eq $node ) {
                eval $n;
        }
}




Pushing mynewimage to rack 1 causes the eth2 interface of compute node r1i0n4 to be configured with IP address 10.20.0.1 when the node is brought up with mynewimage. Push the image, as follows:

# cimage --push-rack mynewimage r1

Customizing Software Images


Note: Procedures in this section describe how to work with service node and compute node images. Always use a cloned image. If you are adjusting an RPM list, use your own copy of the RPM list.


The service and compute node images are created during the configure-cluster operation (or during your upgrade from a prior release). This process uses an RPM list to generate a root on the fly.

You can clone a compute node image or create a new one from an RPM list. For service nodes, SGI does not support a clone operation; create a new service node image from an RPM list instead. For compute images, you can either clone the image and work on the copy or create a new compute node image from the SGI-supplied default RPM list.

Procedure 3-1. Creating a Simple Compute Node Image Clone


    Note: Always work from a cloned image; see “Customizing Software Images”.


    To create a simple compute node image clone from the system admin controller, perform the following steps:

    1. To clone the compute node image, perform the following:

      # cinstallman --create-image --clone --source compute-sles11 --image compute-sles11-new

    2. To see the images and kernels in the list, perform the following:

      # cimage --list-images
      image: compute-sles11
             kernel: 2.6.27.19-5-smp
      
      image: compute-sles11-new
             kernel: 2.6.27.19-5-smp

    3. To push the compute node image out to the rack, perform the following:

      # cimage --push-rack compute-sles11-new r\*

    4. To change the compute nodes to use the cloned image/kernel pair, perform the following:

      # cimage --set compute-sles11-new 2.6.27.19-5-smp "r*i*n*"

    Procedure 3-2. Manually Adding a Package to a Compute Node Image

      To manually add a package to a compute node image, perform the following steps:


      Note: Use the cinstallman command to install packages into images when the package you are adding is in a repository. This example shows a quick way to manually add a package for compute nodes when you do not want the package to be in a custom repository. For information on the cinstallman command, see “cinstallman Command”.


      1. Make a clone of the compute node image, as described in “Customizing Software Images”.


      2. Determine what images and kernels you have available now (this example shows SLES 11), as follows:
        # cimage --list-images
          image: compute-sles11
                 kernel: 2.6.27.19-5-smp
        
          image: compute-sles11-new
                 kernel: 2.6.27.19-5-smp
        

      3. From the system admin controller, change directory to the images directory, as follows:

        # cd /var/lib/systemimager/images/

      4. From the system admin controller, copy the RPMs you wish to add, as follows, where compute-sles11-new is your own compute node image:

        # cp /tmp/newrpm.rpm compute-sles11-new/tmp

      5. The new RPMs now reside in the /tmp directory of the image named compute-sles11-new. To install them into your new compute node image, perform the following commands:

        # chroot compute-sles11-new bash

        And then perform the following:

        # rpm -Uvh /tmp/newrpm.rpm

        At this point, the image has been updated with the RPM.


        Note: Remove the RPMs or ISO images from the image before pushing it; otherwise, the RPM/ISO will be pushed along with each image, slowing down the push and possibly even filling up the root of the RLC (leader node).

        # rm /tmp/newrpm.rpm



      6. The image on the system admin controller is updated. However, you still need to push the changes out. Ensure there are no nodes currently using the image and then run this command:

        # cimage --push-rack compute-sles11-new r\*

        This pushes the updates to the rack leader controllers, and the changes will be seen by the compute nodes the next time they start up. For information on how to ensure the image is associated with a given node, see the cimage --set command and the example in Procedure 3-1.

      Procedure 3-3. Manually Adding a Package to the Service Node Image

        To manually add a package to the service node image, perform the following steps:


        Note: Use the cinstallman command to install packages into images when the package you are adding is in a repository. This example shows a quick way to manually add a package for service nodes when you do not want the package to be in a custom repository. For information on the cinstallman command, see “cinstallman Command”.


        1. Use the cinstallman command to create your own version of the service node image. See “cinstallman Command”.

        2. Change directory to the images directory, as follows:

          # cd /var/lib/systemimager/images/

        3. From the system admin controller, copy the RPMs you wish to add, as follows, where my-service-image is your own service node image:

          # cp /tmp/newrpm.rpm my-service-image/tmp

        4. The new RPMs now reside in the /tmp directory of the image named my-service-image. To install them into your new service node image, perform the following commands:

          # chroot my-service-image bash

          And then perform the following:
          # rpm -Uvh /tmp/newrpm.rpm

          At this point, the image has been updated with the RPM. Please note that, unlike compute node images, changes made to a service node image will not be seen by service nodes until they are reinstalled with the image. If you wish to install the package on running systems, you can copy the RPM to the running system and install it from there.
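          For example, a minimal way to do this from the admin node, assuming a service node named service0 with root ssh access, is as follows:

          # scp /tmp/newrpm.rpm service0:/tmp
          # ssh service0 rpm -Uvh /tmp/newrpm.rpm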

        cimage Command

        The cimage command allows you to list, modify, and set software images on the compute nodes in your system.

        For a help statement, perform the following command:

        admin:~ # cimage --help
        cimage is a program for managing compute node root images in SMC for ICE.
        
        Usage: cimage OPTION ...
        
        Options
         --help                                 Usage and help text.
         --debug                                Output additional debug information.
         --list-images                          List images and their kernels.
         --list-nodes NODE                      List node(s) and what they are set to.
         --set [OPTION] IMAGE KERNEL NODE       Set node(s) to image and kernel.
            --nfs                               Use NFS roots (default).
            --tmpfs                             Use tmpfs roots.
         --set-default [OPTION] IMAGE KERNEL    Set default image, kernel, rootfs type.
            --nfs                               Use NFS roots (default).
            --tmpfs                             Use tmpfs roots.
         --show-default                         Show default image, kernel, rootfs type.
         --add-db IMAGE                         Add image and its kernels to the db.
         --del-db IMAGE                         Delete image and its kernels from db.
         --update-db IMAGE                      Short-cut for --del-db, then --add-db.
         --push-rack [OPTIONS] IMAGE RACK       Push or update image on rack(s).
            --force                             Bypass the booted nodes check, deletes.
            --update-only                       Skip files newer in dest, no delete.
            --quiet                             Turn off diagnostic information.
         --del-rack IMAGE RACK                  Delete an image from rack(s).
         --clone-image OIMAGE NIMAGE            Clone an existing image to a new image.
         --del-image [OPTIONS] IMAGE            Delete an existing image entirely.
            --quiet                             Turn off diagnostic information.
        
        RACK arguments take the format 'rX'
        NODE arguments take the format 'rXiYnZ'
        ROOTFS argument can be either 'nfs' or 'tmpfs'
        
        X, Y, Z can be single digits, a [start-end] range, or * for all matches.
        

        EXAMPLES

        Example 3-1. cimage Command Examples

        The following examples walk you through some typical cimage command operations.

        To list the available images and their associated kernels, perform the following:

        # cimage --list-images
        image: compute-sles11
                kernel: 2.6.27.19-5-carlsbad
                kernel: 2.6.27.19-5-default
        image: compute-sles11-1_7
                kernel: 2.6.27.19-5-default
        

        To list the compute nodes in rack 1 and the image and kernel they are set to boot, perform the following:

        # cimage --list-nodes r1
        r1i0n0: compute-sles11 2.6.27.19-5-default nfs
        r1i0n8: compute-sles11 2.6.27.19-5-default nfs

        The cimage command also shows the root filesystem type (nfs or tmpfs).

        To set the r1i0n0 compute node to boot the 2.6.27.19-5-smp kernel from the compute-sles11 image, perform the following:

        # cimage --set compute-sles11 2.6.27.19-5-smp r1i0n0

        To list the nodes in rack 1 to see the changes set in the example above, perform the following:

        # cimage --list-nodes r1
        r1i0n0: compute-sles11 2.6.27.19-5-smp
        r1i0n1: compute-sles11 2.6.27.19-5-smp
        r1i0n2: compute-sles11 2.6.27.19-5-smp
        [...snip...]

        To set all nodes in all racks to boot the 2.6.27.19-5-smp kernel from the compute-sles11 image, perform the following:

        # cimage --set compute-sles11 2.6.27.19-5-smp r*i*n*
        
        
        

        To set two ranges of nodes to boot the 2.6.27.19-5-smp kernel, perform the following:

        # cimage --set compute-sles11 2.6.27.19-5-smp r1i[0-2]n[5-6] r1i[2-3]n[0-4] 

        To clone the compute-sles11 image to a new image (so that you can modify it), perform the following:

        # cinstallman --create-image --clone --source compute-sles11 --image mynewimage
        Cloning compute-sles11 to mynewimage ... done

        The clone process adds the image and its kernels to the database.


        Note: If you have made changes to the compute node image and are pushing that image out to leader nodes, it is a good practice to use the cinstallman --refresh-bt --image { image} command to refresh the bittorrent tarball and torrent file for a compute node image. This avoids duplication by rsync when the image is pushed out to the leader nodes. For more information, see the cinstallman -h usage statement or “cinstallman Command”.


        To change to the cloned image created in the example, above, copy the needed rpms into the /var/lib/systemimager/images/mynewimage/tmp directory, use the chroot command to enter the directory and then install the rpms, perform the following:

        # cp *.rpm /var/lib/systemimager/images/mynewimage/tmp
        # chroot /var/lib/systemimager/images/mynewimage/ bash
        # rpm -Uvh /tmp/*.rpm

        If you make changes to the kernels in the image, you need to refresh the kernel database entries for your image. To do this, perform the following:

        # cimage --update-db mynewimage
        

        If you did not make changes to the kernels in the cloned image created in the example above, you can omit this step.

        To push new software images out to the compute blades in a rack or set of racks, perform the following:

        # cimage --push-rack mynewimage r*
        r1lead: install-image: mynewimage
        r1lead: install-image: mynewimage done.

        To list the images in the database and the kernels they contain, perform the following:

        # cimage --list-images
        
        image: compute-sles11
                kernel: 2.6.16.60-0.7-carlsbad
                kernel: 2.6.16.60-0.7-smp
        
        image: mynewimage
                kernel: 2.6.16.60-0.7-carlsbad
                kernel: 2.6.16.60-0.7-smp

        To set some compute nodes to boot an image, perform the following:

        # cimage --set mynewimage 2.6.16.60-0.7-smp r1i3n*

        You need to reboot the compute nodes to run the new images.

        To completely remove an image you no longer use, both from the system admin controller and from all compute nodes in all racks, perform the following:

        # cimage --del-image mynewimage
        r1lead: delete-image: mynewimage
        r1lead: delete-image: mynewimage done.


        Using cinstallman to Install Packages into Software Images

        The packages that make up SMC for Altix ICE, SGI Foundation, the Linux distribution media, and any other media or custom repositories you have added all reside in repositories. The cinstallman command looks up the list of all repositories and provides that list to the commands it calls for its operation, such as yume.


        Note: Always work with copies of software images.


        The cinstallman command can update packages within systemimager images. You may also use cinstallman to install a single package within an image.

        However, cinstallman and the commands it calls work only with the configured repositories. So if you are installing your own RPM, that package needs to be part of an existing repository. You may use the crepo command to create a custom repository into which you can collect custom packages.


        Note: The yum command maintains a cache of the package metadata. If you just recently changed the repositories, yum caches for the nodes or images you are working with may be out of date. In that case, you can issue the yum command "clean all" with --yum-node and --yum-image. The cinstallman command --update-node and --update-image options do this for you.

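        For example, to clear the yum cache for a service node image (the image name my-service-sles11 matches the example below), you could run:

        # cinstallman --yum-image --image my-service-sles11 clean all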

        The following example shows how to install the zlib-devel package into the service node image so that the next time you image or install a service node, it will have this new package:
        # cinstallman --yum-image --image my-service-sles11 install zlib-devel

        You can perform a similar operation for compute node images. Note the following:

        • If you update a compute node image on the system admin controller (admin node), you have to use the cimage command to push the changes. For more information on the cimage command, see “cimage Command”.

        • If you update a service node image on the admin node, that service node needs to be reinstalled and/or reimaged to get the change. The discover command can be given an alternate image or you may use the cinstallman --assign-image command followed by the cinstallman --next-boot command to direct the service node to reimage itself with a specified image the next time it boots.
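        For example, assuming a service node named service0 and the service node image used in the example above, the following commands assign the image and mark the node to reimage itself on its next boot:

        # cinstallman --assign-image --node service0 --image my-service-sles11
        # cinstallman --next-boot image --node service0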

        Using yum to Install Packages on Running Service or Leader Nodes


        Note: These instructions only apply to managed service nodes and leader nodes. They do not apply to compute nodes.


        You can use the yum command to install a package on a service node. From the admin node, you can issue a command similar to the following:

        # cinstallman --yum-node --node service0 install zlib-devel


        Note: To get all service nodes, replace service0 with service\*.


        For more information on the cinstallman command, see “cinstallman Command”.

        Creating Compute and Service Node Images Using the cinstallman Command

        You can create service node and compute node images using the cinstallman command. This command automatically generates a root directory for the image.

        Fresh installations of SMC for Altix ICE create these images during the configure-cluster installation step (see “Installing SMC for Altix ICE Admin Node Software ” in Chapter 2).

        The RPM lists that determine which packages get installed in the images are located in /etc/opt/sgi/rpmlists, for example, /etc/opt/sgi/rpmlists/compute-sles11.rpmlist (see “crepo Command”). You should NOT edit the default lists. These default files are recreated by the crepo command when repositories are added or removed. Therefore, use the default RPM lists only as a model for your own.


        Note: The procedure below uses SLES.


        Procedure 3-4. Using the cinstallman Command to Create a Service Node Image:

          To create a service node image using the cinstallman command, perform the following steps:

          1. Make a copy of the example service node image RPM list and work on the copy, as follows:

            # cp /etc/opt/sgi/rpmlists/service-sles11.rpmlist
            /etc/opt/sgi/rpmlists/my-service-node.rpmlist

          2. Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.

          3. Use the cinstallman command with the --create-image option to create the images root directory, as follows:

            # cinstallman --create-image --image my-service-node-image --rpmlist
            /etc/opt/sgi/rpmlists/my-service-node.rpmlist

            This example uses my-service-node-image as the name of the image.

            Output is logged to /var/log/cinstallman on the admin node.

          4. After the cinstallman command finishes, the image is ready to be used with service nodes. You can supply this image as an optional image name to the discover command, or you can assign an existing service node to this image using the cinstallman --assign-image command. You can tell a service node to image itself on its next reboot by using the cinstallman --next-boot option.

          Procedure 3-5. Use the cinstallman Command to Create a Compute Node Image

            To create a compute node image using the cinstallman command, perform the following steps:

            1. Make a copy of the compute node image RPM list and work on the copy, as follows:

              # cp /etc/opt/sgi/rpmlists/compute-sles11.rpmlist
              /etc/opt/sgi/rpmlists/my-compute-node.rpmlist 

            2. Add or remove any packages from the RPM list. Keep in mind that needed dependencies are pulled in automatically.

            3. Run the cinstallman command to create the root, as follows:

              # cinstallman --create-image --image my-compute-node-image --rpmlist
              /etc/opt/sgi/rpmlists/my-compute-node.rpmlist

              This example uses my-compute-node-image as the image name.

              Output is logged to /var/log/cinstallman on the admin node.

              The cinstallman command makes the new image available to the cimage command.

            4. For information on how to use the cimage command to push this new image to rack leader controllers (leader nodes), see “cimage Command”.

            Installing a Service Node with a Non-default Image

            If you have a non-default service node image you wish to install on a service node, you have two choices, as follows:

            • Specify the image name when you first discover the node with the discover command.

            • Use the cinstallman command to associate an image with a service node, then set up the node to reinstall itself the next time it boots.

            The following example shows how to associate a custom image at discover time:

            # discover --service 2,image=my-service-node-image

            The next example shows how to reinstall an already discovered service node with a new image:
            # cinstallman --assign-image --node service2 --image my-service-node-image
            # cinstallman --next-boot image --node service2

            When you reboot the node, it will reinstall itself.

            For more information on the discover command, see “discover Command” in Chapter 2. For more information on the cinstallman command, see “cinstallman Command”.

            Retrieving a Service Node Image from a Running Service Node

            To retrieve a service node image from a running service node, perform the following steps:

            1. As the root user, log in to the service node from which you wish to retrieve an image. You can use the si_prepareclient(8) program to extract an image. Start the program, as follows:

              service0:~ # si_prepareclient --server admin
              Welcome to the SystemImager si_prepareclient command. This command may modify
              the following files to prepare your golden client for having its image 
              retrieved by the imageserver.  It will also create the /etc/systemimager 
              directory and fill it with information about your golden client. All modified
              files will be backed up with the .before_systemimager-3.8.0 extension.
              
               /etc/services:
                 This file defines the port numbers used by certain software on your system.
                 Entries for rsync will be added if necessary.
              
               /tmp/filetlOeP5:
                 This is a temporary configuration file that rsync needs on your golden client
                 in order to make your filesystem available to your SystemImager server.
              
               inetd configuration:
                 SystemImager needs to run rsync as a standalone daemon on your golden client
                 until its image is retrieved by your SystemImager server.  If rsyncd is 
                 configured to run as a service started by inetd, it will be temporarily
                 disabled, and any running rsync daemons or commands will be stopped.  Then,
                 an rsync daemon will be started using the temporary configuration file
                 mentioned above.
                 
              See "si_prepareclient --help" for command line options.
              
              Continue? (y/[n]): 

              Enter y to continue. After a few moments, you are returned to the command prompt. You are now ready to retrieve the image from the admin node.

            2. Exit the service0 node and, as the root user on the admin node, perform the following command (replace the image name and service node name, as needed):

              admin # mksiimage --Get --client service0 --name myimage

              The command retrieves the image. No progress information is provided; the operation takes several minutes, depending on the size of the image on the service node.

            3. Use the cinstallman command to register the newly collected image:

              admin # cinstallman --create-image --use-existing --image myimage

            4. If you want to discover a node using this image directly, you can use the discover command, as follows:

              admin # discover --service 0,image=myimage

            5. If you want to re-image an already discovered node with your new image, run the following commands:

              # cinstallman --assign-image --node service0 --image myimage
              # cinstallman --next-boot image --node service0
              

            6. Reboot the service node.

            Using a Custom Repository for Site Packages

            This section describes how to maintain packages specific to your site and have them available to the crepo command (see “crepo Command”).

            SGI suggests putting site-specific packages in a separate location. They should not reside in the same location as SGI or Novell supplied packages.

            Procedure 3-6. Setting Up a Custom Repository for Site Packages

              To set up a custom repository for your custom packages, perform the following steps:

              1. Create a directory for your site-specific packages on the system admin controller (admin node), as follows:

                # mkdir -p /tftpboot/site-local/sles-10-x86_64

              2. Copy your site packages into the new directory, as follows:

                # cp my-package-1.0.x86_64.rpm /tftpboot/site-local/sles-10-x86_64

              3. Register your custom repository using the crepo command. This command ensures that your repository is consulted when the cinstallman command performs its operations. It also creates the necessary yum/repomd metadata.

                # crepo --add /tftpboot/site-local/sles-10-x86_64 --custom my-repo

                Going forward, cinstallman command operations, including updating images, updating nodes, and creating images, may consult your new repository.

              4. If you wish this repository to be used by cinstallman by default, you need to select it. Use the following command:

                # crepo --select my-repo

              5. If you use cinstallman to create an image, you will want to add your custom package to the rpmlist you use with the cinstallman command (see “Using cinstallman to Install Packages into Software Images”).
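              For example, once the repository is selected, you could also install the example package directly into an existing image via yum (the image name my-service-node-image is the one created earlier in this chapter):

                # cinstallman --yum-image --image my-service-node-image install my-package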

              SGI Altix ICE System Configuration Framework

              All node types that are part of an SGI Altix ICE system can have configuration settings adjusted by the configuration framework. There is some overlap between the per-host customization instructions and the configuration framework instructions. Each approach plays a role in configuring your system. The major differences between the two methods are, as follows:

              • Per-host customization runs at the time an image is pushed to the rack leader controllers.

              • Per-host customization only applies to compute node images.

              • The Altix ICE system configuration framework can be used with all node types.

              • The system configuration framework is run when a new root is created, when the SuSEconfig command is run for some other reason, as part of a yum operation, or when new compute images are pushed with the cimage command.

              This framework exists to make it easy to adjust configuration items. There are SGI-supplied scripts already present. You can add more scripts as you wish. You can also exclude scripts from running without purging the script if you decide a certain script should not be run. The following set of questions in bold and bulleted answers describes how to use the system configuration framework.

              How does the system configuration framework operate?

              Configuration framework scripts could be added, for example, to a running service node or to an already created service or compute image. Remember that images destined for compute nodes need to be pushed with the cimage command after being altered. For more information, see “cimage Command”.

              • A /opt/sgi/lib/cluster-configuration script is called; where it is called from is described below.

              • That script iterates through scripts residing in /etc/opt/sgi/conf.d.

              • Any scripts listed in /etc/opt/sgi/conf.d/exclude are skipped, as are scripts that are not executable.

              • Scripts in the system configuration framework must be tolerant of files that do not exist yet, as described below. For example, check that a syslog configuration file exists before trying to adjust it.

              • Scripts ending in a distro name, or a distro name with a specific distro version, are only run if the node in question is running that distro. For example, /etc/opt/sgi/conf.d/99-foo.sles would only run if the node was running SLES. The following example shows the precedence of operations: if you had 88-myscript.sles10, 88-myscript.sles, and 88-myscript, then:

                • On a sles10 system, 88-myscript.sles10 would execute

                • On a sles system that is not sles10, 88-myscript.sles would execute

                • On all other distros, 88-myscript would execute

              • If you wish to make a custom version of a script supplied by SGI, you can simply name it with ".local" and the local version runs in place of the one supplied by SGI. This allows for customization without modifying scripts supplied by SGI. Scripts ending in .local have the highest precedence. In other words, if you had 88-myscript.sles and 88-myscript.local, then 88-myscript.local would execute in all cases and the other 88-myscript scripts would never execute.

              From where is the framework called?

              • The callout for /opt/sgi/lib/cluster-configuration is implemented as a yum plugin that executes after packages have been installed and cleaned.

              • On SLES only, there is also a SUSE configuration script in the /sbin/conf.d directory, called SuSEconfig.00cluster-configuration, that calls the framework. This is in case you are using YaST to install or upgrade packages.

              • On SLES only, one of the scripts called by the framework calls SuSEconfig. A check is made to avoid a callout loop.

              • The framework is also called when the admin, leader, or service nodes start up. The call is made just after networking is configured. As a site administrator, you could create custom scripts here that check on or perform certain configuration operations.

              • When using the cimage command to push a compute node root image to rack leaders, the configuration framework executes within the chroot of the compute node image after it is pulled from the admin node to the rack leader node.

              How do I adjust my system configuration?

              • Create a small script in /etc/opt/sgi/conf.d to do the adjustment.

                Be sure that you test for existence of files and do not assume they are there (see "Why do scripts need to tolerate files that do not exist but should?" below).
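                A minimal sketch of such a script follows; the script name 90-mysite-example, the file it adjusts, and the NTP server name are hypothetical, but the pattern of testing for the file first is the important part:

                #!/bin/sh
                # Hypothetical site script: /etc/opt/sgi/conf.d/90-mysite-example
                # The configuration framework may run before all packages are
                # installed, so always test that a file exists before adjusting it.
                FILE=/etc/ntp.conf
                if [ -f "${FILE}" ]; then
                    # Site-specific adjustment: append a server line if not present.
                    grep -q "^server ntp.mysite.example" "${FILE}" || \
                        echo "server ntp.mysite.example" >> "${FILE}"
                fi
                exit 0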

              Why do scripts need to tolerate files that do not exist but should?

              • This is because the mksiimage command runs yume and yum in two steps. The first step installs only about 40 RPMs, but the framework is called then, too. The second pass installs the remaining hundreds of RPMs. So the framework is called once before many packages are installed, and again after everything is in place. Therefore, not all files you expect might be available when your small script is called.

              How does the yum plugin work?

              • In order for the yum plugin to work, the /etc/yum.conf file has to have plugins=1 set in its configuration. SMC for Altix ICE software ensures that this is in place by way of a trigger in the sgi-cluster package. Any time yum is installed or updated, it verifies that plugins=1 is set.
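              To confirm the setting on a node or in an image root, you can simply check the stock yum configuration file:

                # grep plugins /etc/yum.conf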

              How does yume work?

              • yume, an OSCAR wrapper for yum, works by creating a temporary yum configuration file in /tmp and then pointing yum at it. This temporary configuration file needs to have plugins enabled. A tiny patch to yume makes this happen. This fixes it for yume and also for mksiimage, which calls yume as part of its operation.

              Cluster Configuration Repository: Updates on Demand

              SMC for ICE contains a cluster configuration repository/update framework. This framework generates and distributes configuration updates to admin, service, and leader nodes in the cluster. Some of the configuration files managed by this framework include C3, conserver, DNS, Ganglia, hosts files, and NTP.

              When an event occurs that requires these files to be updated, the framework executes on the admin node. The admin node stores the updated configuration files in a special cached location and updates the appropriate nodes with their new configuration files.

              In addition to the updates happening as required, the configuration file repository is consulted when an admin, service, or leader node boots. This happens shortly after networking is started. Any configuration files that are new or updated are transferred at this early stage so that the node is fully configured by the time it is fully operational.

              There are no hooks for customer configuration in the configuration repository at this time.

              This update framework is tied in with the /etc/opt/sgi/conf.d configuration framework to provide a full configuration solution. As mentioned earlier, customers are encouraged to create /etc/opt/sgi/conf.d scripts to do cluster configuration.

              cnodes Command

              The cnodes command provides information about the types of nodes in your system. For help information, perform the following:

              [admin ~]# cnodes --help
              Options:
               --all                  all compute, leader and service nodes, and switches
               --compute              all compute nodes
               --leader               all leader nodes
               --service              all service nodes
               --switch               all switch nodes
               --online               modifier: nodes marked online
               --offline              modifier: nodes marked offline
               --managed              modifier: managed nodes
               --unmanaged            modifier: unmanaged nodes
               --smc-for-ice-names    modifier: return SMC-for-ICE node names instead of hostnames
              
              Note: default modifiers are 'online' and 'managed' unless otherwise specified.

              EXAMPLES

              Example 3-2. cnodes Example

              The following examples walk you through some typical cnodes command operations.

              To see a list of all nodes in your system, perform the following:

              [admin ~]# cnodes --all
              r1i0n0
              r1i0n1
              r1lead
              service0

              To see a list of all compute nodes, perform the following:
              [admin ~]# cnodes --compute
              r1i0n0
              r1i0n1

              To see a list of service nodes, perform the following:
              [admin ~]#  cnodes --service
              service0
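              The node-type options can be combined with the modifiers listed in the usage statement. For example, to list compute nodes that are currently marked offline (the output depends on the state of your system), perform the following:
              [admin ~]# cnodes --compute --offline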


              Multi-distro Image Management

              By default, SMC for Altix ICE software associates one software distribution (distro) with all the images and nodes in the system. For example, if RHEL 6 is used for the admin node then, by default, RHEL 6 is used for the compute blades, leader nodes, and service nodes.

              However, SMC for Altix ICE software allows support for multiple distros for compute nodes and service nodes. This means that the nodes and images for service and compute nodes need not match the Linux distribution running on admin/leader nodes.

              The following information is intended to make it easier for you to see which media goes with which distributions.

              • RHEL 6

                Required:

                • SGI Foundation Software 2.3

                • SMC for Altix ICE 1.0

                • Red Hat Enterprise Linux 6 Install DVD

                Optional:

                • SGI MPI 1.0

                • SGI Accelerate 1.0

              • RHEL 5.5

                Required:

                • SGI Foundation 1 Service Pack 6

                • SMC for Altix ICE 1.0

                • Red Hat Enterprise Linux 5.5 Install DVD

                Optional:

                • SGI ProPack 6 Service Pack 6

              • SLES 11 SP1

                Required:

                • SGI Foundation Software 2.3

                • SMC for Altix ICE 1.1

                • SUSE Linux Enterprise Server 11 SP1 Install DVD #1

                Optional:

                • SGI MPI 1.1

                • SGI Accelerate 1.1

              • SLES 10 SP3

                Required:

                • SGI Foundation Software 1 Service Pack 6

                • SMC for Altix ICE 1.1

                • SUSE Linux Enterprise Server 10 SP3 Install DVD #1

                Optional:

                • SGI ProPack 6 Service Pack 6

              The crepo command, described in “crepo Command”, is the starting point for multi-distro support.

              Here is an example of the commands you might run in order to create a RHEL 5.5 service and compute node image:

              • First, make sure no repositories are currently selected, as follows:

                # crepo --show

                For any repository in the output above that is marked as selected, run this command to unselect it, as follows:

                # crepo --unselect repository name

              • Next, register the repositories for RHEL 5.5. You can point the crepo command at an ISO image or at the mounted media. The ISO file names may not exactly match the ones you downloaded. In this example, we include the optional TBD media.

                # crepo --add foundation-2.3-cd1-media-rhel5-x86_64.iso
                # crepo --add TBD-cd1-media-rhel5-x86_64.iso
                # crepo --add RHEL5.5-Server-20100322.0-x86_64-DVD.iso
                # crepo --add smc-1.0-cd1-media-rhel5-x86_64.iso

              • Now, select all of the repositories you just added. Use crepo --show to find the names to use.

                # crepo --select SGI-Foundation-Software-1SP6-rhel5
                # crepo --select SGI-TBD-for-Linux-rhel5
                # crepo --select smc-1.0-rhel5
                # crepo --select Red-Hat-Enterprise-Linux-Server-5.5
                

              • Now, create images:

                # cinstallman --create-image --image service-rhel55 --rpmlist /etc/opt/sgi/rpmlists/generated-service-rhel5.5.rpmlist
                # cinstallman --create-image --image compute-rhel55 --rpmlist /etc/opt/sgi/rpmlists/generated-compute-rhel5.5.rpmlist

              • For the service node, you are now ready to image a node. If the node is not yet discovered, use the discover command with the image= parameter. If the node is already discovered and you wish to re-install it, use the cinstallman --assign-image and cinstallman --next-boot operations to assign the new image to the node in question and mark it for installation on next boot. You can reset the service node and it will install itself.

              • For the compute image, you need to also push it to the racks. For example:

                # cimage --push-rack compute-rhel55 r\* 

              • You can then use the cimage --set operation to associate compute blades with the new image (a minimal sketch follows this list).

              • Reboot or reset the compute nodes associated with the new image.
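
              The following is a minimal sketch of the cimage --set step referenced above (the kernel version is a placeholder for whatever kernel your compute-rhel55 image contains, and r1i0n0 is an illustrative blade name):

                # cimage --set compute-rhel55 <kernel-version> r1i0n0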

              Power Management Commands

              The cpower command allows you to power up, power down, reset, and show the power status of system components.

              cpower Command

              The syntax of the cpower command is as follows:

              cpower [<option> ...] [<target_type>] [<action>] <target>

              The <option> argument can be one or more of the following:

              Option 

              Description

              --noleader 

              Do not include leader nodes (valid with rack and system domains only).

              --noservice 

              Do not include service nodes (valid with system domain only).

              --force 

              When using wildcards in the target, disable all “safety” checks. Make sure you really want to use this command.

              -n, --noexec 

              Displays, but does not execute, commands that affect power.

              -v, --verbose 

              Prints additional information on command progress.

              The <target_type> argument is one of the following:

              --node 

              Applies the action to nodes. Nodes are compute nodes, rack leader controllers (leader nodes), system admin controller (admin node), and service nodes. [default]

              --iru 

              Applies the action at the IRU level.

              --rack 

              Applies the action at the rack level.

              --system 

              Applies the action to the system. You must not specify a target with this type.

              The <action> argument is one of the following:

              --status 

              Show the power status of the target, including whether it is booted or not. [default]

              --up | --on 

              Powers up the target.

              --down | --off 

              Powers down the target.

              --reset 

              Performs a hard reset on the target.

              --cycle 

              Power cycles the target.

              --boot 

              Boots up the target, unless it is already booted. Waits for all targets to boot.

              --reboot 

              Reboots the target, even if already booted. Waits for all targets to boot.

              --halt 

              Halts and then powers off the target.

              --shutdown 

              Shuts down the target, but does not power it off. Waits for targets to shut down.

              --identify <interval> 

              Turns on the identifying LED for the specified interval in seconds. Use an interval of 0 to turn it off immediately.

              -h, --help 

              Shows help usage statement.

              The target must always be specified except when the --system option is used. Wildcards may be used, but be careful not to accidentally power off or reboot the leader nodes. If wildcard use affects any leader node, the command fails with an error.

              Operations on Nodes

              The default for the cpower command is to operate on system nodes, such as compute nodes, leader nodes, or service nodes. If you do not specify --iru, --rack, or --system, the command defaults to operating as if you had specified --node.

              Here are examples of node target names:

              • r1i3n10

                Compute node at rack 1, IRU 3, slot 10

              • service0

                Service node 0

              • r3lead

                Rack leader controller (leader node) for rack 3

              • r1i*n*

                Wildcards let you specify ranges of nodes. For example, r1i*n* specifies all compute nodes in all IRUs in rack 1.

              IPMI-style Commands

              The default operation for the cpower command is to operate on nodes and to show the power status of those nodes, as follows:

              # cpower r1i*n*
              

              This command is equivalent to the following:

              # cpower --node --status r1i*n*
              

              This command issues an ipmitool power off command to all of the nodes specified by the wildcard, as follows:

              # cpower --off r2i*n*  

              As with the previous examples, the command applies to nodes by default.

              The following commands behave exactly as if you were using ipmitool directly and have no special ordering logic:

              # cpower --up r1i*n*

              # cpower --reset r1i*n*

              # cpower --cycle r1i*n*

              # cpower --identify 5 r1i*n*


              Note: --up is a synonym for --on and --down is a synonym for --off.
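
              You can preview the commands that any of these operations would issue, without actually affecting power, by adding the --noexec option, for example:

              # cpower --noexec --off r1i*n*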


              IRU, Rack, and System Domains

              The cpower command applies more logic when you move up to higher levels of abstraction, for example, when you use --iru, --rack, and --system. These higher-level domain specifiers tell the command how to correctly order the actions that you give on the command line.

              The --iru option tells the command to use the correct ordering with IRU power commands. In this case, the command first connects to the CMC on each IRU in rack 1 to issue the power-on command, which turns on power to the IRU chassis (this is not the equivalent ipmitool command). It then powers up the compute nodes in the IRU. Powering down is the opposite: power to the IRU is turned off after power to the blades. IRU targets are specified as follows: r3i2 for rack 3, IRU 2. Here is an example:

              # cpower --iru --up r1i* 

              The --rack option ensures that power commands to the leader node are issued in the correct order relative to the compute nodes within a rack. First, it powers up the leader node and waits for it to boot up (if it is not already up). Then it does the functional equivalent of a cpower --iru --up r4i* on each of the IRUs contained in the rack, including applying power to each IRU chassis. Using the --down option is the opposite, and also turns off the leader node (after doing a shutdown) after all the IRUs are powered down. To avoid including leader nodes in a power command for a rack, use the --noleader option. Rack targets are specified as follows: r4 for rack 4. Here is an example:

              # cpower --rack --up r4

              Commands with the --system option ensure that power-up commands are applied first to service nodes, then to leader nodes, and then to IRUs and compute blades, in just the same way. Likewise, compute blades are powered down before IRUs, leader nodes, and service nodes, in that order. To avoid including service nodes in a system-domain command, use the --noservice option. Note that you must not specify a target with the --system option, since it applies to the entire Altix ICE system.
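
              For example, to power up the entire system in the correct order (no target is specified with the --system option), perform the following:

              # cpower --system --up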

              Shutting Down and Booting


              Note: The --shutdown --off combination of actions was deprecated in a previous release. Use the --halt option in its place.


              In most cases, it is useful to shut down a machine before turning off the power. The following cpower options enable you to do this: --halt, --boot, and --reboot. The --halt option allows you to shut down a node. The --reboot option ensures that a system is always rebooted, whereas --boot only boots up a system if it is not already booted. Thus, --boot is useful for booting up compute blades that have failed to start.

              You need to configure the order in which service nodes are booted up and shut down as part of the overall system power management process. This is done by setting a boot_order for each service node. Use the cadmin command to set the boot order for a service node, for example:

              # cadmin --set-boot-order --node service0 2

              The cpower --system --boot command boots service nodes with a lower boot order first, and then boots service nodes with a higher boot order. The reverse is true when shutting down the system with cpower. For example, if service1 has a boot order of 3 and service2 has a boot order of 5, service1 is booted completely before service2 is booted. During shutdown, service2 is shut down completely before service1 is shut down.

              There is a special meaning to a service node having a boot order of zero. This value causes the cpower --system command to skip that service node completely for both start up and shutdown (although not for status queries). Negative values for the service node boot order setting are not permitted.
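
              For example, to exclude a service node (a hypothetical service3 is shown here) from cpower --system start up and shutdown, set its boot order to zero:

              # cadmin --set-boot-order --node service3 0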


              Note: The IPMI power commands necessary to boot a system (either a power reset or a power on) may be sent to a node. The --halt option halts the target node and then powers it off.


              The --halt option works at the node, IRU, or rack domain level. It shuts down nodes (in the correct order if you use the --iru or --rack options) and then powers them off. This is particularly useful when powering off a rack, since otherwise the leader node might be shut down before there is a chance to power off the compute blades. Here is an example:

              # cpower --halt --rack r1

              To boot up systems that have not already been booted, perform the following:

              # cpower --boot  r1i2n*

              Again, the command boots up nodes in the right order if you specify the --iru or --rack options and the appropriate target. Otherwise, there is no guarantee that, for example, the command will attempt to power on the leader node before the compute nodes in the same rack.

              To reboot all of the nodes specified, or boot them if they are already shut down, perform the following:

              # cpower --reboot --iru r3i3

              The --iru or --rack options ensure proper ordering if you use them. In this case, the command makes sure that power is supplied to the chassis for rack 3, IRU 3, and then all the compute nodes in that IRU are rebooted.

              EXAMPLES

              Example 3-3. cpower Command Examples

              To boot compute blade r1i0n8, perform the following:

              # cpower --boot r1i0n8
              

              To boot a number of compute blades at the same time, perform the following:

              # cpower --boot --rack r1


              Note: The --boot option will only boot those nodes that have not already booted.


              To shut down service node 0, perform the following:

              # cpower --halt service0

              To shut down and switch off everything in rack 3, perform the following:

              # cpower --halt --rack r3


              Note: This command will shut down and then power off all of the compute nodes in parallel, and then shut down and power off the leader node. Use the --noleader option if you want the leader node to remain booted up.


              To shut down the entire system, including all service nodes and all leader nodes (but not the admin node), without turning off power to anything, perform the following:

              # cpower --halt --system

              To shut down all the compute nodes, but not the service nodes or leader nodes, perform the following:

              # cpower --halt --system --noleader --noservice


              Note: The only way to shut down the system admin controller (admin node) is to perform the operation manually.



              C3 Commands


              Note: For legacy Altix ICE systems, this section remains intact. However, SGI recommends you use the pdsh and pdcp utilities described in “pdsh and pdcp Utilities”.


              This section describes the cluster command and control (C3) tool suite for cluster administration and application support.

              Note: The SMC for Altix ICE version of C3 does not include the cshutdown and cpushimage commands.


              The C3 commands used on the SGI Altix ICE 8200 system are as follows:

              C3 Utilities 

              Description

              cexec(s) 

              Executes a given command string on each node of a cluster

              cget 

              Retrieves a specified file from each node of a cluster and places it into the specified target directory

              ckill 

              Runs kill on each node of a cluster for a specified process name

              clist 

              Lists the names and types of clusters in the cluster configuration file

              cnum 

              Returns the node names specified by the range specified on the command line

              cname 

              Returns the node positions specified by the node name given on the command line

              cpush 

              Pushes files from the local machine to the nodes in your cluster

              cexec is the most useful C3 utility. Use the cpower command rather than cshutdown (see “Power Management Commands”).

              EXAMPLES

              Example 3-4. C3 Command General Examples

              The following examples walk you through some typical C3 command operations.

              You can use the cname and cnum commands to map names to locations and vice versa, as follows:

              # cname rack_1:0-2
              local name for cluster:  rack_1
              nodes from cluster:  rack_1
              cluster:  rack_1 ; node name:  r1i0n0
              cluster:  rack_1 ; node name:  r1i0n1
              cluster:  rack_1 ; node name:  r1i0n10
              
              # cnum rack_1: r1i0n0
              local name for cluster:  rack_1
              nodes from cluster:  rack_1
              r1i0n0 is at index 0 in cluster rack_1
              
              # cnum rack_1: r1i0n1
              local name for cluster:  rack_1
              nodes from cluster:  rack_1

              You can use the clist command to retrieve the number of racks, as follows:

              # clist
              cluster  rack_1  is an indirect remote cluster
              cluster  rack_2  is an indirect remote cluster
              cluster  rack_3  is an indirect remote cluster
              cluster  rack_4  is an indirect remote cluster

              You can use the cexec command to view the addressing scheme of the C3 utility, as follows:

              # cexec rack_1:1 hostname
              ************************* rack_1 *************************
              ************************* rack_1 *************************
              --------- r1i0n1---------
              r1i0n1
              
              # cexec rack_1:2-3 rack_4:0-3,10 hostname
              ************************* rack_1 *************************
              ************************* rack_1 *************************
              --------- r1i0n10---------
              r1i0n10
              --------- r1i0n11---------
              r1i0n11
              ************************* rack_4 *************************
              ************************* rack_4 *************************
              --------- r4i0n0---------
              r4i0n0
              --------- r4i0n1---------
              r4i0n1
              --------- r4i0n10---------
              r4i0n10
              --------- r4i0n11---------
              r4i0n11
              --------- r4i0n4---------
              r4i0n4
              

              The following set of commands shows how to use the C3 commands to traverse the different levels of hierarchy in your Altix ICE system (for information on the hierarchical design of your Altix ICE system, see “Basic System Building Blocks” in Chapter 1).

              To execute a C3 command on all blades within the default cluster, for example, rack 1, perform the following:

              # cexec hostname
              ************************* rack_1 *************************
              ************************* rack_1 *************************
              --------- r1i0n0---------
              r1i0n0
              --------- r1i0n1---------
              r1i0n1
              --------- r1i0n10---------
              r1i0n10
              --------- r1i0n11---------
              r1i0n11
              
              ...

              To run a C3 command on all compute nodes across an Altix ICE system, perform the following:

              # cexec --all hostname
              ************************* rack_1 *************************
              ************************* rack_1 *************************
              --------- r1i0n0---------
              r1i0n0
              --------- r1i0n1---------
              r1i0n1
              ...
              --------- r2i0n10---------
              r2i0n10
              ...
              --------- r3i0n11---------
              r3i0n11
              ...

              To run a C3 command against the first rack leader controller, in the first rack, perform the following:

              # cexec --head hostname
              ************************* rack_1 *************************
              --------- rack_1---------
              r1lead
              

              To run a C3 command against all rack leader controllers across all racks, perform the following:

              # cexec --head --all hostname
              ************************* rack_1 *************************
              --------- rack_1---------
              r1lead
              ************************* rack_2 *************************
              --------- rack_2---------
              r2lead
              ************************* rack_3 *************************
              --------- rack_3---------
              r3lead
              ************************* rack_4 *************************
              --------- rack_4---------
              r4lead

              The following set of examples shows some specific use cases for the C3 commands that you are likely to employ.

              Example 3-5. C3 Command Specific Use Examples

              From the system admin controller, run a command on rack 1 without including the rack leader controller, as follows:

              # cexec rack_1: <cmd>

              Run a command on all service nodes only, as follows:

              # cexec -f /etc/c3svc.conf <cmd>

              Run a command on all compute nodes in the system, as follows:

              # cexec --all <cmd>

              Run a command on all rack leader controllers, as follows:

              # cexec --all --head <cmd>

              Run a command on blade 42 (compute node 42) in rack 2, as follows:

              # cexec rack_2:42 <cmd>

              From a service node over the InfiniBand Fabric, run a command on all blades (compute nodes) in the system, as follows:

              # cexec --all <cmd>

              Run a command on blade 42 (compute node 42), as follows:

              # cexec blades:42 <cmd> 


              pdsh and pdcp Utilities

              The pdsh(1) command is the parallel shell utility. The pdcp(1) command is the parallel copy/fetch utility. The SMC for Altix ICE software populates some dshgroups files for the various node types. On the admin node, SMC for Altix ICE software populates the leader and service groups files, which contain the list of online nodes in each of those groups.

              On the leader node, software populates the compute group for all the online compute nodes in that group.

              On the service node, software populates the compute group which contains all the online compute nodes in the whole system.

              For more information, see the pdsh(1) and pdcp(1) man pages.

              EXAMPLES

              From the admin node, to run the hostname command on all the leader nodes, perform the following:

              # pdsh -g leader hostname

              To run the hostname command on all the compute nodes in the system, via the leader nodes, perform the following:

              # pdsh -g leader pdsh -g compute hostname

              To run the hostname command on just r1lead and r2lead, perform the following:

              # pdsh -w r1lead,r2lead hostname
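
              The pdcp(1) command accepts similar targeting options for copying files in parallel. For example, to copy a local file (the file name here is only illustrative) to the same location on r1lead and r2lead, assuming the pdcp build on your system supports the same target options as pdsh, perform the following:

              # pdcp -w r1lead,r2lead /etc/motd /etc/motd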

              cadmin: SMC for Altix ICE Administrative Interface

              The cadmin command allows you to change certain administrative parameters in the cluster, such as the boot order of service nodes, the administrative status of nodes, and the adding, changing, and removing of IP addresses associated with service nodes.

              To get the cadmin usage statement, perform the following:

              # cadmin --h
              cadmin: SMC for Altix ICE Administrative Interface
              Help:
              
              In general, these commands operate on {node}.  {node} is the SMC for Altix ICE style
              node name.  For example, service0, r1lead, r1i0n0.  Even when the host name
              for a service node is changed, the SMC for Altix ICE name for that node may still be used
              for {node} below.  The node name can either be the SMC for Altix ICE unique node name
              or a customer-supplied host name associated with a SMC for Altix ICE unique node name.
              
              --version : Display current release information
              --set-admin-status --node {node} {value}  : Set Administrative Status
              --show-admin-status --node {node} : Show Administrative Status
              --set-boot-order --node {node} [value] : Set boot order [1]
              --show-boot-order --node {node} : Show boot order [1]
              --set-ip --node {node} --net {net} {hostname}={ip} : Change an allocated ip [1]
              --del-ip --node {node} --net {net} {hostname}={ip} : Delete an ip [1]
              --add-ip --node {node} --net {net} {hostname}={ip} : allocate a new ip [1]
              --show-ips --node {node} : Show all allocated IPs associated with node
              --set-hostname --node {node} {new-hostname} : change the host name [5]
              --show-hostname --node {node} : show the current host name for ice node {node}
              --set-subdomain {domain} : Set the cluster subdomain [3]
              --show-subdomain : Show the cluster subdomain
              --set-admin-domain {domain} : Set the admin node house network domain
              --show-admin-domain : Show the admin node house network domain
              --db-purge --node {node} : Purge service or lead node (incl entire rack) from DB
              --set-external-dns --ip {ip} : Set IP addr(s) of external DNS master(s) [4]
              --show-external-dns : Show the IP addr(s) of the external DNS master(s)
              --del-external-dns : Delete the configuration of external DNS master(s)
              --show-root-labels : Show grub root labels if multiple roots are in use
              --set-root-label --slot {#} --label {label} : Set changeable part of root label
              --show-default-root : Show default root if multiple roots are in use
              --set-default-root --slot {#} : Set the default slot if multiple roots in use
              --show-current-root : Show current root slot
              --enable-auto-recovery : Enable ability for nodes to recover themselves [6]
              --disable-auto-recovery : Disable auto recovery [6]
              --show-auto-recovery : Show the current state of node auto recovery [6]
              --set-redundant-mgmt-network --node {node} {value}:  Configure network
                  management redundancy; valid values are "yes" and "no".
              --show-redundant-mgmt-network --node {node}:  Show current value.
              --show-dhcp-option: Show admin dhcp option code used to distinguish mgmt network
              --set-dhcp-option {value}: Set admin dhcp option code
              
              Node-attribute options:
              --add-attribute [--string-data "{string}"] [--int-data {int}] {attribute-name}
              --is-attribute {attribute-name}
              --delete-attribute {attribute-name}
              --set-attribute-data [--string-data "{string}"] [--int-data {int}]
                {attribute-name}
              --get-attribute-data {attribute-name}
              --search-attributes [--string-data "{string|regex}"] [--int-data {int}]
              --add-node-attribute [--string-data "{string}"] [--int-data {int}]
                --node {node} --attribute {attribute-name}
              --is-node-attribute --node {node} --attribute {attribute-name}
              --delete-node-attribute --node {node} --attribute {attribute-name}
              --set-node-attribute-data [--string-data "{string}"] [--int-data {int}]
                --node {node} --attribute {attribute-name}
              --get-node-attribute-data --node {node} --attribute {attribute-name}
              --search-node-attributes [--node {node}] [--attribute {attribute-name}]
                [--string-data "{string|regex}"] [--int-data {int}]
              
              Descriptions of Selected Values:
              {hostname}={ip} means specify the host name associated with the specified
              ip address.
              {net} is the SMC for Altix ICE network to change such as ib-0, ib-1, head, gbe, bmc, etc
              {node} is a SMC for Altix ICE-style node name such as r1lead, service0, or r1i0n0.
              [1] Only applies to service nodes
              [2] This operation may require the cluster to be fully shut down and AC power
                  to be removed.  IPs will have to be re-allocated to fit in the new range.
              [3] All cluster nodes will have to be reset
              [4] Use quoted, semi-colon separated list if more than one master
              [5] Only applies to admin and service nodes
              [6] Auto recovery will allow service and leader nodes to boot in to a special
                  recovery mode if the cluster doesn't recognize them.  This is enabled by
                  default and would be used, for example, if a node's main board was replaced
                  but the original system disks were imported from the original system.
              

              EXAMPLES

              Example 3-6. SMC for Altix ICE Administrative Interface (cadmin) Command

              Set a node offline, as follows:

              # cadmin --set-admin-status --node r1i0n0 offline

              Set a node online, as follows:

              # cadmin --set-admin-status --node r1i0n0 online

              Set the boot order for a service node, as follows:

              # cadmin --set-boot-order --node service0 2

              Add an IP to an existing service node, as follows:

              # cadmin --add-ip --node service0 --net ib-0 my-new-ib0-ip=10.148.0.200

              Change the SMC for Altix ICE needed service0-ib0 IP address, as follows:

              # cadmin --set-ip --node service0 --net head service0=172.23.0.199

              Show currently allocated IP addresses for service0, as follows:

              # cadmin --show-ips --node service0
              IP Address Information for SMC for Altix ICE node: service0
              
              ifname        ip               Network  
              
              myservice-bmc 172.24.0.3       head-bmc 
              myservice     172.23.0.3       head     
              myservice-ib0 10.148.0.254     ib-0     
              myservice-ib1 10.149.0.67      ib-1     
              myhost        172.24.0.55      head-bmc 
              myhost2       172.24.0.56      head-bmc 
              myhost3       172.24.0.57      head-bmc 

              Delete a site-added IP address (you cannot delete SMC for Altix ICE needed IP addresses), as follows:

              admin:~ # cadmin --del-ip --node service0 --net ib-0 my-new-ib0-2-ip=10.148.0.201

              Change the hostname associated with service0 to be myservice, as follows:

              admin:~ # cadmin --set-hostname --node service0 myservice

              Change the hostname associated with admin to be newname, as follows:

              admin:~ # cadmin --set-hostname --node admin newname

              Set and show the cluster subdomain, as follows:

              admin:~ # cadmin --set-subdomain mysubdomain.domain.mycompany.com
              admin:~ # cadmin --show-subdomain
              The cluster subdomain is: mysubdomain

              Show the admin node house network domain, as follows:

              admin:~ # cadmin --show-admin-domain
              The admin node house network domain is: domain.mycompany.com

              Show the SMC for Altix ICE systems DHCP option identifier, as follows:

              admin:~ # cadmin  --show-dhcp-option
              149


              Console Management

              SMC for Altix ICE systems management software uses the open-source console management package called conserver. For detailed information on conserver, see http://www.conserver.com/.

              An overview of the conserver package is as follows:

              • Manages the console devices of all managed nodes in an Altix ICE system

              • A conserver daemon runs on the system admin controller (admin node) and the rack leader controllers (leader nodes). The system admin controller manages leader and service node consoles. The rack leader controllers manage blade consoles.

              • The conserver daemon connects to the consoles using ipmitool. Users connect to the daemon to access them. Multiple users can connect but non-primary users are read-only.

              • The conserver package is configured to allow all consoles to be accessed from the system admin controller.

              • All consoles are logged. These logs can be found at /var/log/consoles on the system admin controller and rack leader controllers. An autofs configuration file is created to allow you to access rack leader controller managed console logs from the system admin controller, as follows:

                admin #  cd /net/r1lead/var/log/consoles/ 

              The /etc/conserver.cf file is the configuration file for the conserver daemon. This file is generated for both the system admin controller and the rack leader controllers from the /opt/sgi/sbin/generate-conserver-files script on the system admin controller. This script is called from the discover-rack command as part of rack discovery or rediscovery; it generates the conserver.cf file for the rack in question and regenerates the conserver.cf file for the system admin controller.


              Note: The conserver package replaces cconsole for access to all consoles (blades, leader nodes, and managed service nodes).


              You may find the following conserver man pages useful:

              Man Page  

              Description

              console(1) 

              Console server client program

              conserver(8) 

              Console server daemon

              conserver.cf(5) 

              Console configuration file for conserver(8)

              conserver.passwd(5) 

              User access information for conserver(8)

              Procedure 3-7. Using conserver Console Manager

                To use the conserver console manager, perform the following steps:

                1. To see the list of available consoles, perform the following:

                  admin:~ # console -x
                   service0                 on /dev/pts/2                       at  Local 
                   r2lead                   on /dev/pts/1                       at  Local 
                   r1lead                   on /dev/pts/0                       at  Local 
                   r1i0n8                   on /dev/pts/0                       at  Local 
                   r1i0n0                   on /dev/pts/1                       at  Local                 

                2. To connect to the service console, perform the following:

                  admin:~ # console service0
                  [Enter `^Ec?' for help]
                  
                  
                  Welcome to SUSE Linux Enterprise Server 10 sp2 (x86_64) - Kernel 2.6.16.60-0.12-smp (ttyS1).
                  
                  
                  service0 login: 
                  


                3. To connect to the rack leader controller console, perform the following:

                  admin:~ # console r1lead
                  [Enter `^Ec?' for help]
                  
                  
                  Welcome to SUSE Linux Enterprise Server 10 sp2 (x86_64) 
                  - Kernel 2.6.16.60-0.12-smp (ttyS1).
                  
                  
                  r1lead login:

                4. To trigger system request commands sysrq (once connected to a console), perform the following:

                  Ctrl-e c l 1 8               # set log level to 8
                  Ctrl-e c l 1 <sysrq cmd>            # send sysrq command

                5. To see the list of conserver escape keys, perform the following:

                  Ctrl-e c ?

                Keeping System Time Synchronized

                The SMC for Altix ICE systems management software uses the network time protocol (NTP) as the primary mechanism to keep the nodes in your Altix ICE system synchronized. This section describes how this mechanism operates on the various Altix ICE components and covers these topics:

                System Admin Controller NTP

                When you used the configure-cluster command, it guided you through setting up NTP on the admin node. The NTP client on the system admin controller should point to the house network time server. The NTP server provides NTP service to system components so that nodes can consult it when they are booted. The system admin controller sends NTP broadcasts to some networks to keep the nodes in sync after they have booted.

                Rack Leader Controller NTP

                The NTP client on the rack leader controller gets time from the system admin controller when it is booted and then stays in sync by connecting to the admin node for time. The NTP server on the leader node provides NTP service to Altix ICE components so that compute nodes can sync their time when they are booted. The rack leader controller sends NTP broadcasts to some networks to keep the compute nodes in sync after they have booted.

                Managed Service, Compute, and Leader BMC Setup with NTP

                The BMC controllers on managed service nodes, compute nodes, and leader nodes are also kept in sync with NTP. Note that you may need the latest BMC firmware for the BMCs to sync with NTP properly. The NTP server information for BMCs is provided by special options stored in the DHCP server configuration file.
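
                For illustration only, the standard ISC dhcpd syntax for supplying an NTP server to DHCP clients looks like the following (the IP address is a placeholder, and the SMC-generated configuration may use different or additional vendor-specific option codes for the BMCs):

                option ntp-servers 172.23.0.1;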

                Service Node NTP

                The NTP client on managed service nodes (for a definition of managed, see “discover Command” in Chapter 2) sets its time at initial booting from the system admin controller. It listens to NTP broadcasts from the system admin controller to stay in sync. It does not provide any NTP service.

                Compute Node NTP

                The NTP Client on the compute node sets its time at initial booting from the rack leader controller. It listens to NTP broadcasts from the rack leader controller to stay in sync.

                NTP Work Arounds

                Sometimes, especially during the initial deployment of an Altix ICE system when system components are being installed and configured for the first time, NTP is not available to serve time to system components.

                A non-modified NTP server, running for the first time, takes quite some time before it offers service. This means the leader and service nodes may fail to get time from the system admin controller as they come on-line. Compute nodes may also fail to get time from the leader when they first come up. This situation usually only happens at first deployment. After the ntp servers have a chance to create their drift files, ntp servers offer time with far less delay on subsequent reboots.

                The following work arounds are in place for situations when NTP cannot serve the time:

                • The admin and rack leader controllers have the time service enabled (xinetd).

                • All system node types have the netdate command.

                • A special startup script is on leader, service, and compute nodes that runs before the NTP startup script.

                  This script attempts to get the time using the ntpdate command. If the ntpdate command fails because the NTP server it is using is not yet ready to offer time service, it uses the netdate command to get the clock "close" (a minimal sketch of this logic follows the list).

                  The ntp startup script starts the NTP service as normal. Since the clock is known to be "close", NTP will fix the time when the NTP servers start offering time service.
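
                  The following is a minimal sketch of that fallback logic (the server name time-server is a placeholder; the actual script shipped on the nodes differs):

                  #!/bin/sh
                  # Try NTP first; if the server is not yet offering time
                  # service, fall back to netdate to get the clock "close".
                  if ! ntpdate time-server >/dev/null 2>&1; then
                          netdate tcp time-server
                  fi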

                Changing the Size of /tmp on Compute Nodes

                This section describes how to change the size of /tmp on Altix ICE compute nodes.

                Procedure 3-8. Increasing the /tmp Size

                  To change the size of /tmp on your system compute nodes, perform the following steps:

                  1. From the admin node, change directory (cd) to /opt/sgi/share/per-host-customization/global.

                  2. Open the sgi-fstab file and change the size= parameter for the /tmp mount in both locations where it appears (an example of the changed line follows the script).

                    #!/bin/sh                                                                   
                    #                                                                           
                    # Copyright (c) 2007,2008 Silicon Graphics, Inc.                            
                    # All rights reserved.                                                      
                    #                                                                           
                    #  This program is free software; you can redistribute it and/or modify     
                    #  it under the terms of the GNU General Public License as published by     
                    #  the Free Software Foundation; either version 2 of the License, or        
                    #  (at your option) any later version.                                      
                    #                                                                           
                    #  This program is distributed in the hope that it will be useful,          
                    #  but WITHOUT ANY WARRANTY; without even the implied warranty of           
                    #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the            
                    #  GNU General Public License for more details.                             
                    #                                                                           
                    #  You should have received a copy of the GNU General Public License        
                    #  along with this program; if not, write to the Free Software              
                    #  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA 
                    #                                                                           
                    # Set up the compute node's /etc/fstab file.                                
                    #                                                                           
                    # Modify per your sites requirements.                                       
                    #                                                                           
                    # This script is executed once per-host as part of the install-image operation
                    # run on the leader nodes, which is called from cimage on the admin node.      
                    # The full path to the per-host iru+slot directory is passed in as $1,         
                    # e.g. /var/lib/sgi/per-host/<imagename>/i2n11.                                
                    #                                                                              
                    
                    # sanity checks
                    . /opt/sgi/share/per-host-customization/global/sanity.sh
                    
                    iruslot=$1
                    os=( $(/opt/oscar/scripts/distro-query -i ${iruslot} | sed -n '/^compat /s/^compat.*: //p') )
                    
                    compatdistro=${os[0]}${os[1]}                                                   
                    
                    if [ ${compatdistro} = "sles10" -o ${compatdistro} = "sles11" ]; then
                    
                            #
                            # SLES 10 compatible
                            #
                            cat <<EOF >${iruslot}/etc/fstab
                    # <file system> <mount point>   <type>  <options>       <dump>  <pass>
                    tmpfs           /tmp            tmpfs   size=150m       0       0
                    EOF
                    
                    elif [ ${compatdistro} = "rhel5"  ]; then
                    
                            #
                            # RHEL 5 compatible
                            #
                    
                            #
                            # RHEL expects several subsys directories to be present under /var/run
                            # and /var/lock, hence no tmpfs mounts for them
                            #
                            cat <<EOF >${iruslot}/etc/fstab
                    # <file system> <mount point>   <type>  <options>       <dump>  <pass>
                    tmpfs           /tmp            tmpfs   size=150m       0       0
                    devpts          /dev/pts        devpts  gid=5,mode=620  0       0
                    EOF
                    
                    else
                    
                            echo -e "\t$(basename ${0}): Unhandled OS.  Doing nothing"
                    
                    fi
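
                    For example, to grow /tmp to 512 MB, you would change both tmpfs lines so that they read as follows (the size shown is illustrative; choose a value appropriate for your site):

                    tmpfs           /tmp            tmpfs   size=512m       0       0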
                    

                  3. Push the image out to the racks to pick up the change, as follows:

                    # cimage --push-rack mynewimage r\*

                    For more information on using the cimage command, see “cimage Command”.

                  Enabling or Disabling the Compute Node iSCSI Swap Device

                  This section describes how to enable or disable the Internet small computer system interface (iSCSI) compute node swap device. The iSCSI compute node swap device is turned off by default for new installations. It can cause problems during rack-wide out of memory (OOM) conditions, with both compute nodes and the rack leader controller (RLC) becoming unresponsive during the heavy write-out to the per-node iSCSI swap devices.

                  Procedure 3-9. Enabling the iSCSI Swap Device

                    If you wish to enable the iSCSI swap device in a given compute node image, perform the following steps:

                    1. Change root (chroot) into the compute node image on the admin node and enable the iscsiswap service, as follows:

                      # chroot /var/lib/systemimager/images/compute-sles11 chkconfig iscsiswap on

                    2. Then, push the image out to the racks, as follows:

                       # cimage --push-rack compute-sles11 r\*

                    Procedure 3-10. Disabling the iSCSI Swap Device

                      To disable the iSCSI swap device in a compute node image where it is currently enabled, perform the following steps:

                      1. Disable the service, as follows:

                        # chroot /var/lib/systemimager/images/compute-sles11 chkconfig iscsiswap off

                      2. Then, push the image out to the racks, as follows:

                        # cimage --push-rack compute-sles11 r\*

                      Changing the Size of Per-node Swap Space

                      This section describes how to change per-node swap space on your SGI Altix ICE system.

                      Procedure 3-11. Increasing Per-node Swap Space

                        To increase the default size of the per-blade swap space on your system, perform the following:

                        1. Shut down all blades in the affected rack (see “Shutting Down and Booting”).

                        2. Log into the leader node for the rack in question. (Note that you need to do this on each rack leader).

                        3. Change directory (cd) to the /var/lib/sgi/swapfiles directory.

                        4. To adjust the swap space size appropriate for your site, run a script similar to the following:

                          #!/bin/bash
                          
                          size=262144     # size in KB
                          
                          for i in $(seq 0 3); do
                                  for n in $(seq 0 15); do
                                          dd if=/dev/zero of=i${i}n${n} bs=1k count=${size}
                                          mkswap i${i}n${n}
                                  done
                          done

                        5. Reboot all blades in the affected rack (see “Shutting Down and Booting”).

                        6. From the rack leader node, use the cexec --all command to run the free(1) command on the compute blades to view the new swap sizes, as follows:

                          r1lead:~ # cexec --all free
                          ************************* rack_1 *************************
                          --------- r1i0n0---------
                                       total       used       free     shared    buffers     cached
                          Mem:       2060140     206768    1853372          0          4      46256
                          -/+ buffers/cache:     160508    1899632
                          Swap:        49144          0      49144
                          --------- r1i0n1---------
                                       total       used       free     shared    buffers     cached
                          Mem:       2060140     137848    1922292          0          4      44200
                          -/+ buffers/cache:      93644    1966496
                          Swap:        49144          0      49144
                          --------- r1i0n8---------
                                       total       used       free     shared    buffers     cached
                          Mem:       2060140     138076    1922064          0          4      43172
                          -/+ buffers/cache:      94900    1965240
                          Swap:        49144          0      49144

                        If you want to change the per-node swap space across your entire system, so that all (new) leader nodes pick up the change as part of discovery, you can edit the /etc/opt/sgi/conf.d/35-compute-swapfiles file inside the lead-sles11 image on the admin node. The images are in the /var/lib/systemimager/images directory. For more information on customizing these images, see “Customizing Software Images”.

                        Switching Compute Nodes to a tmpfs Root

                        This section describes how to switch your system compute nodes to a tmpfs root.

                        Procedure 3-12. Switching Compute Nodes to a tmpfs Root

                          To switch your compute nodes to a tmpfs root, from the system admin controller (admin node) perform the following steps:

                          1. To switch compute nodes to a tmpfs root, use the optional --tmpfs flag to the cimage --set command, for example:

                            admin:~ # cimage --set --tmpfs compute-sles11 2.6.27.19-5-smp r1i0n0


                            Note: To use a tmpfs root with the standard compute node image, the compute node needs to have 4 GB of memory or more. A standard tmpfs mount has access to half the system memory, and the standard compute node image is just over 1 GB in size.


                          2. You can view the current setting of a compute node, as follows:

                            admin:~ # cimage --list-nodes r1i0n0
                            r1i0n0: compute-sles11 2.6.27.19-5-smp tmpfs

                          3. To set it back to an NFS root, use the --nfs flag to the cimage --set command, as follows:

                            admin:~ # cimage --set --nfs compute-sles11 2.6.27.19-5-smp r1i0n0

                          4. You can then view the change back to an NFS root, as follows:

                            admin:~ # cimage --list-nodes r1i0n0
                            r1i0n0: compute-sles11 2.6.27.19-5-smp nfs

                            For help information, use the cimage --h option.

                          Setting up Local Storage Space for Swap and Scratch Disk Space

                          The SGI Altix ICE 8400 system has the option to support local storage space on compute nodes (also known as blades). Solid-state drive (SSD) devices and 2.5" disks are available for this purpose. You can define the size and status for both swap and scratch partitions. The values can be set globally or per node or group of nodes. By default, the disks are partitioned only if blank, the swap is off, and the scratch is set to occupy the whole disk space and be mounted in /tmp/scratch.

                          The /etc/init.d/set-swap-scratch script is responsible for auto-configuring the swap and scratch space based on the settings retrieved via the cattr command. You can use the cadmin command to configure settings globally, or you can use the cattr command to set custom values for specific nodes.

                          The /etc/opt/sgi/conf.d/30-set-swap-scratch script makes sure the /etc/init.d/swapscratch service is on so that swap/scratch partitions are configured directly after booting. The swapscratch service calls the /opt/sgi/lib/set-swap-scratch script when the service is started and then exits.

                          You can customize the following settings:

                          • blade_disk_allow_partitioning

                            The default value is "on" which means that the set-swap-scratch script will repartition and format the local storage disk if needed.


                            Note: To protect user data, the script will not re-partition a disk that is already partitioned. In this case, you need to blank the disk before it can be used for swap/scratch.


                            The set-swap-scratch script uses the following command to retrieve the blade_disk_allow_partitioning value for the node on which it is running:

                            # cattr get blade_disk_allow_partitioning -N $compute_node_name --default on

                            You can globally set the value on, as follows:

                            # cadmin --add-attribute --string-data on blade_disk_allow_partitioning

                            You can globally turn it off, as follows:

                            # cadmin --add-attribute --string-data off blade_disk_allow_partitioning

                          • blade_disk_swap_status

                            The default value is "off" which means that the set-swap-scratch script will not enable a swap partition on the local storage disk.

                            The set-swap-scratch script uses the following command to retrieve the blade_disk_swap_status value for the node on which it is running:

                            # cattr get blade_disk_swap_status -N $compute_node_name --default off

                            You can globally set the value on, as follows:

                            # cadmin --add-attribute --string-data on blade_disk_swap_status

                            You can globally turn it off, as follows:

                            # cadmin --add-attribute --string-data off blade_disk_swap_status

                            The set-swap-scratch script uses the SGI_SWAP label when partitioning the disk. It enables the swap only if it finds a partition labeled SGI_SWAP.

                          • blade_disk_swap_size

                            The default value is 0 which means that the set-swap-scratch script will not create a swap partition on the local storage disk.

                            The set-swap-scratch script uses the following command to retrieve the blade_disk_swap_size value for the node on which it is running:

                            # cattr get blade_disk_swap_size -N $compute_node_name --default 0

                            You can globally set the value, as follows:

                            # cadmin --add-attribute --string-data 1024 blade_disk_swap_size

                            The size is specified in megabytes. Allowed values are, as follows: 0, -0 (use all free space when partitioning), 1, 2, ...

                          • blade_disk_scratch_status

                            The default value is "off" which means that the set-swap-scratch script will not enable the scratch partition on the local storage disk.

                            The set-swap-scratch script uses the following command to retrieve the blade_disk_scratch_status value for the node on which it is running:

                            # cattr get blade_disk_scratch_status -N $compute_node_name --default off

                            You can globally set the value on, as follows:

                            # cadmin --add-attribute --string-data on blade_disk_scratch_status

                            You can globally turn it off, as follows:

                            # cadmin --add-attribute --string-data off blade_disk_scratch_status


                            Note: The set-swap-scratch script uses the SGI_SCRATCH label when partitioning the disk. It mounts the scratch only on the partition labeled as SGI_SCRATCH .


                          • blade_disk_scratch_size

                            The default value is -0, which means that the set-swap-scratch script will use all remaining free space when creating the scratch partition.

                            The set-swap-scratch script uses the following command to retrieve the blade_disk_scratch_size value for the node on which it is running:

                            # cattr get blade_disk_scratch_size -N $compute_node_name --default -0

                            You can globally set the value, as follows:

                            # cadmin --add-attribute --string-data 10240 blade_disk_scratch_size

                            The size is specified in megabytes. Allowed values are 0, -0 (use all free space when partitioning), 1, 2, and so on.

                          • blade_disk_scratch_mount_point

                            The default value is /tmp/scratch, which means that the set-swap-scratch script will mount the scratch partition on /tmp/scratch.

                            The set-swap-scratch script uses the following command to retrieve the blade_disk_scratch_mount_point value for the node on which it is running:

                            # cattr get blade_disk_scratch_mount_point -N $compute_node_name --default /tmp/scratch

                            You can globally set the value, as follows:

                            # cadmin --add-attribute --string-data /tmp/scratch blade_disk_scratch_mount_point

                            You can mount the disk at any mount point you desire. The set-swap-scratch script creates that directory if it does not exist (as long as the script has permission to create it at that path). The root mount point (/) is not writable on the compute nodes, so if you want to mount something like /scratch, you need to create that directory as part of the compute node image (see the example following this list).
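
                          For example, to enable both a 1024 MB swap partition and a scratch partition mounted on /scratch on all compute blades, you could combine the attributes described above. The following is a minimal sketch; the image name (compute-image-name) and rack identifier (r1) are placeholders, so substitute the values appropriate for your site. The blade_disk_scratch_size attribute is left at its default of -0 (use all remaining free space), so it is not set here.

                          # mkdir /var/lib/systemimager/images/compute-image-name/scratch
                          # cadmin --add-attribute --string-data on blade_disk_allow_partitioning
                          # cadmin --add-attribute --string-data on blade_disk_swap_status
                          # cadmin --add-attribute --string-data 1024 blade_disk_swap_size
                          # cadmin --add-attribute --string-data on blade_disk_scratch_status
                          # cadmin --add-attribute --string-data /scratch blade_disk_scratch_mount_point
                          # cimage --push-rack compute-image-name r1

                          Because the root mount point (/) is not writable on the compute nodes, the mkdir command above creates the /scratch directory inside the compute node image before the image is pushed out to the leader nodes.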

                          For a cattr command help statement, perform the following command:

                          # cattr -h
                          Usage:
                            cattr [--help] COMMAND [ARG]...
                          
                          Commands:
                            exists   check for the existence of an attribute
                            get      print the value of an attribute
                            list     print a list of attribute values
                            set      set the value of an attribute
                            unset    delete the value of an attribute
                          
                          For more detailed help, use 'cattr COMMAND --help'.

                          Viewing the Compute Node Read-Write Quotas

                          This section describes how to view the per compute node read and write quota.

                          Procedure 3-13. Viewing the Compute Node Read-Write Quotas

                            To view the per compute node read and write quota, log onto the leader node and perform the following:

                            r1lead:~ # xfs_quota -x -c 'quota -ph 1'
                            Disk quotas for Project #1 (1)
                            Filesystem   Blocks  Quota  Limit Warn/Time    Mounted on
                            /dev/disk/by-label/sgiroot
                                          64.6M      0     1G  00 [------] /
                            

                            Map the XFS project ID to the quota you are interested in by looking it up in the /etc/projects file.
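
                            Each /etc/projects entry has the form project_ID:directory. For example, to find the directory that corresponds to project ID 1 (the ID reported by xfs_quota above), you could run the following on the leader node; grep is a standard command and is shown here only as an illustration:

                            r1lead:~ # grep '^1:' /etc/projects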

                            If you decide to change the xfs_quota values, log back onto the admin node and edit the /etc/opt/sgi/cminfo file inside the compute image where you want to change the value, for example, /var/lib/systemimager/images/image_name. Change the value of the PER_BLADE_QUOTA variable and then repush the image with the following command:

                            # cimage --push-rack image_name racks
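
                            For example, assuming a compute image named compute-image-name and rack 1 (both placeholders; adjust them to your configuration), the sequence on the admin node might look like the following:

                            admin:~ # vi /var/lib/systemimager/images/compute-image-name/etc/opt/sgi/cminfo   # adjust PER_BLADE_QUOTA
                            admin:~ # cimage --push-rack compute-image-name r1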

                            For help information, perform the following:

                            xfs_quota> help
                            df [-bir] [-hn] [-f file] -- show free and used counts for blocks and inodes
                            help [command] -- help for one or all commands
                            print -- list known mount points and projects
                            quit -- exit the program
                            quota [-bir] [-gpu] [-hnv] [-f file] [id|name]... -- show usage and limits
                            
                            Use 'help commandname' for extended help

                            Use help commandname for extended help, such as the following:

                            xfs_quota> help quota
                            
                            quota [-bir] [-gpu] [-hnv] [-f file] [id|name]... -- show usage and limits
                            
                             display usage and quota information
                            
                             -g -- display group quota information
                             -p -- display project quota information
                             -u -- display user quota information
                             -b -- display number of blocks used
                             -i -- display number of inodes used
                             -r -- display number of realtime blocks used
                             -h -- report in a human-readable format
                             -n -- skip identifier-to-name translations, just report IDs
                             -N -- suppress the initial header
                             -v -- increase verbosity in reporting (also dumps zero values)
                             -f -- send output to a file
                             The (optional) user/group/project can be specified either by name or by
                             number (i.e. uid/gid/projid).
                            
                            xfs_quota> 
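
                            For example, still within the xfs_quota session, you can display usage and limits for another project in human-readable form; the project ID 2 shown here is illustrative, so use the ID you looked up in /etc/projects:

                            xfs_quota> quota -ph 2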

                            RAID Utility

                            The infrastructure nodes on your Altix ICE system have LSI RAID enabled by default from the factory. An lsiutil command-line utility is included with the installation for the admin node, the leader node, and the service node (when installed from the SGI service node image). This tool allows you to look at the devices connected to the RAID controller and manage them. Some functions, such as setting up mirrored or striped volumes, can be handled by either the LSI BIOS configuration tool or the lsiutil utility.


                            Note: These instructions only apply to Altix XE250 or Altix XE270 systems with the 1068-based controller. They do not apply to Altix XE250 or Altix XE270 systems that have the LSI Megaraid controller.


                            Example 3-7. Using the lsiutil Utility

                            The following example shows a sample lsiutil command-line session.

                            Start the lsiutil tool, as follows:

                            admin:~ # lsiutil
                            
                            LSI Logic MPT Configuration Utility, Version 1.54, January 22, 2008
                            
                            1 MPT Port found
                            
                                 Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
                             1.  /proc/mpt/ioc0    LSI Logic SAS1068E B2     105      01140100     0
                            
                            Select a device:  [1-1 or 0 to quit] 

                            Select 1 to show the MPT Port, as follows:

                            1 MPT Port found
                            
                                 Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
                             1.  /proc/mpt/ioc0    LSI Logic SAS1068E B2     105      01140100     0
                            
                            Select a device:  [1-1 or 0 to quit] 1
                            
                             1.  Identify firmware, BIOS, and/or FCode
                             2.  Download firmware (update the FLASH)
                             4.  Download/erase BIOS and/or FCode (update the FLASH)
                             8.  Scan for devices
                            10.  Change IOC settings (interrupt coalescing)
                            13.  Change SAS IO Unit settings
                            16.  Display attached devices
                            20.  Diagnostics
                            21.  RAID actions
                            22.  Reset bus
                            23.  Reset target
                            42.  Display operating system names for devices
                            45.  Concatenate SAS firmware and NVDATA files
                            60.  Show non-default settings
                            61.  Restore default settings
                            69.  Show board manufacturing information
                            97.  Reset SAS link, HARD RESET
                            98.  Reset SAS link
                            99.  Reset port
                             e   Enable expert mode in menus
                             p   Enable paged mode in menus
                             w   Enable logging
                            
                            Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 

                            Choose 21. RAID actions, as follows:

                            Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 21
                            
                             1.  Show volumes
                             2.  Show physical disks
                             3.  Get volume state
                             4.  Wait for volume resync to complete
                            23.  Replace physical disk
                            26.  Disable drive firmware update mode
                            27.  Enable drive firmware update mode
                            30.  Create volume
                            31.  Delete volume
                            32.  Change volume settings
                            50.  Create hot spare
                            99.  Reset port
                             e   Enable expert mode in menus
                             p   Enable paged mode in menus
                             w   Enable logging
                            
                            RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 

                            Choose 2. Show physical disks, to show the status of the disks making up the volume, as follows:

                            RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 2
                            
                            1 volume is active, 2 physical disks are active
                            
                            PhysDisk 0 is Bus 0 Target 1
                              PhysDisk State:  online
                              PhysDisk Size 238475 MB, Inquiry Data:  ATA      Hitachi HDT72502 A73A
                            
                            PhysDisk 1 is Bus 0 Target 2
                              PhysDisk State:  online
                              PhysDisk Size 238475 MB, Inquiry Data:  ATA      Hitachi HDT72502 A73A
                            
                            RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 

                            Choose 1. Show volumes, to show information about the volume including its health, as follows:

                            RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 1
                            
                            1 volume is active, 2 physical disks are active
                            
                            Volume 0 is Bus 0 Target 0, Type IM (Integrated Mirroring)
                              Volume Name:                                  
                              Volume WWID:  09195c6d31688623
                              Volume State:  optimal, enabled
                              Volume Settings:  write caching disabled, auto configure
                              Volume draws from Hot Spare Pools:  0
                              Volume Size 237464 MB, 2 Members
                              Primary is PhysDisk 1 (Bus 0 Target 2)
                              Secondary is PhysDisk 0 (Bus 0 Target 1)
                            
                            RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 


                            Restoring the grub Boot-loader on a Node

                            If the grub(8) boot-loader was not written to a rack leader controller (leader node) or to any of the system service nodes, or if it is not functioning correctly, the grub boot-loader must be re-installed on the master boot record (MBR) of the root drive for that node.

                            To rewrite grub to the MBR of the root drive on a system that is booted, issue the following grub commands:

                            # grub
                            grub>    root (hd0,0)
                            grub>    setup (hd0)
                            grub>    quit

                            If you cannot boot your system (and it hangs at grub), you need to boot the node in rescue mode and then issue the following commands:

                            # mount /dev/ /system
                            # mount -o bind /dev /system/dev
                            # mount -t proc proc /system/proc # optional
                            # mount -t sysfs sysfs /system/sys # optional
                            # chroot /system
                            # grub
                            grub>    root (hd0,0)
                            grub>    setup (hd0)
                            grub>    quit
                            # reboot

                            Backing up and Restoring the System Database

                            The SMC for Altix ICE systems management software captures the relevant data for the managed objects in an SGI Altix ICE system. Managed objects are the hierarchy of nodes described in “Basic System Building Blocks” in Chapter 1. The system database is critical to the operation of your SGI Altix ICE system and you need to back up the database on a regular basis.

                            Managed objects on an SGI Altix ICE system include the following:

                            • Altix ICE system

                              One Altix ICE system is modeled as a meta-cluster. This meta-cluster contains the racks, each modeled as a sub-cluster.

                            • Nodes

                              System admin controller (admin node), rack leader controllers (leader nodes), service nodes, compute nodes (blades) and chassis management control blades (CMCs) are modeled as nodes.

                            • Networks

                              The preconfigured and potentially customized IP networks.

                            • NICs

                              The network interfaces for Ethernet and InfiniBand adapters.

                            • Images

                              The node images installed on each particular node.

                            SGI recommends that you keep three backups of your system database at any given time. You should implement a rotating backup procedure following the son-father-grandfather principle.

                            Procedure 3-14. Backing up and Restoring the System Database

                              To back up and restore the system database, perform the following steps:


                              1. Note: A password is required to use the mysqldump command. The password file is located in the /etc/odapw file.


                                From the system admin controller, to back up the system database perform a command similar to the following:
                                # mysqldump --opt oscar > backup-file.sql
                                

                              2. To read the dump file back into the system admin controller, perform a command similar to the following:

                                # mysql oscar < backup-file.sql

                              For more information, see the mysqldump(1) man page.
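
                              The following is a minimal sketch of a rotating backup that keeps the three most recent dumps, as suggested above. The backup directory (/root/db-backups) and the retention count are assumptions; adapt them to your site, and consider running the script from cron on the system admin controller.

                                #!/bin/sh
                                # Minimal rotating backup sketch for the system database (oscar).
                                # The backup directory below is a hypothetical location; change as needed.
                                BACKUP_DIR=/root/db-backups
                                mkdir -p "$BACKUP_DIR"
                                # Dump the database to a date-stamped file. A password is required,
                                # as described in the note above (see /etc/odapw).
                                mysqldump --opt oscar > "$BACKUP_DIR/oscar-$(date +%Y%m%d-%H%M).sql"
                                # Keep only the three newest dumps (son, father, grandfather).
                                ls -1t "$BACKUP_DIR"/oscar-*.sql | tail -n +4 | xargs -r rm -f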

                              Enabling EDNS

                              Extension mechanisms for DNS (EDNS) can cause excessive logging activity when not working properly. SMC for Altix ICE contains code to limit EDNS logging. This section describes how to delete this code and allow EDNS to work unrestricted and log messages.

                              Procedure 3-15. Enabling EDNS

                                To enable EDNS on your Altix ICE system, perform the following steps:

                                1. Open the /opt/sgi/lib/Tempo/Named.pm file with your favorite editing tool.

                                2. To remove the limit on the edns_udp_size parameter, comment out or remove the following line:

                                  $limit_edns_udp_size = "edns-udp-size 512;";

                                3. Remove the following lines so that EDNS logging is no longer disabled:

                                  logging {
                                  category lame-servers {null; };
                                  category edns-disabled { null; };  };

                                Firmware Management

                                The fwmgr tool and its associated libraries form a firmware update framework. This framework makes managing the various firmware types in a cluster easier.

                                 A given cluster may have several types of firmware, including mainboard BIOS, BMC, disk controllers, InfiniBand (ib) interfaces, Ethernet NICs, network switches, and many other types.

                                The firmware management tools allow the firmware to be stored in a central location (firmware bundle library) to be accessed by command line or graphical tools. The tools allow you to add firmware to the library, remove firmware from the library, install firmware on a given set of nodes, and other related operations.

                                License Requirement

                                This framework is licensed. It cannot be used without the appropriate license.

                                Terminology

                                This section describes some terminology associated with the firmware management, as follows:

                                • Raw firmware file

                                   These are files that you download, likely from SGI, that include the firmware and, optionally, the tools to flash that firmware. For example, a raw firmware file for an Altix ICE compute node BIOS update might be downloaded as sgi-ice-blade-bios-2009.12.14-1.x86_64.rpm.

                                • Firmware bundle

                                   A firmware bundle is a file that contains the firmware to be flashed in a way that the integrated tools understand. Normally, firmware bundles are stored in the firmware bundle library (see below). However, these bundles can also be checked out of the library and accessed directly in some cases. In most situations, a firmware bundle is a sort of wrapper around the raw firmware file(s) and various attributes and tools. A firmware bundle can contain more than one type of firmware. This is the case when the underlying flash tool supports more than one firmware type. An example of this is the SGI ICE compute node firmware, which contains several different BIOS files for different mainboards and multiple BMC firmware revisions. Another example might be a raw file that includes both the BIOS and BMC firmware for a given mainboard/server.

                                • Firmware bundle library

                                  This is a storage repository for firmware bundles. The management tools allow you to query the library for available bundles and associated attributes.

                                • Update environment

                                   Some raw firmware types, like the various Altix ICE firmware released as RPMs, run "live" on the admin node to facilitate flashing. The underlying tool may indeed set nodes up to network boot a low-level flash tool, but there are many other methods used by the underlying tools. Some firmware types, like BIOS ROMs with associated flash executables, require an update environment to be constructed. One type of update environment is a DOS update environment. This update environment may be used, for example, to construct a DOS boot image for the BIOS ROM and associated flash tool. A firmware bundle calls for a specific update environment. In this way, a firmware bundle and its associated update environment together form the pieces necessary to boot a DOS update environment over the network and flash the target nodes with the specified BIOS ROM (as an example).

                                Firmware Update High Level Example

                                This section describes the steps you need to take to update a set of nodes in your cluster with a new BIOS level, as follows:

                                 • Download the raw firmware file for this system type. You might do this, for example, from the SGI Supportfolio web site at https://support.sgi.com/login.

                                • Add the raw firmware file to the firmware bundle library using a graphical or command line tool.

                                 • The tool will convert the raw firmware file into a firmware bundle and store it in the firmware bundle library. In some cases, you will be required to provide additional information in order to convert the raw firmware file into a firmware bundle. This could be information that is necessary to facilitate flashing and that the framework cannot derive from the file on its own.

                                • Once the firmware bundle is available in the firmware library, you can use the graphical or command line tool to select a firmware bundle and a list of target nodes to which to push the firmware update.

                                • The underlying tool then creates the appropriate update environment (if required) and facilitates flashing of the nodes.

                                 Firmware Manager Command Line Interface (fwmgr)

                                The fwmgr command is the command line interface (CLI) to the firmware update infrastructure.

                                For a usage statement, enter fwmgr --help. The fwmgr command has several sub-commands, each of which can be called with the --help option for usage information.

                                You can use the fwmgr command to perform the following:

                                • List the available firmware bundles

                                • Add raw firmware files or firmware bundle files to the firmware bundle library. If it is a raw firmware type, it will be converted to a firmware bundle and placed in the library.

                                • Remove firmware bundles from the firmware bundle library

                                • Rename an existing firmware bundle in the firmware bundle library

                                • Install a given firmware bundle on to a list of nodes

                                 • Check out a firmware bundle, which allows you to store the firmware bundle file itself outside the library for direct access


                                Note: It is currently not necessary to run the fwmgrd command (firmware manager daemon) to use the CLI.


                                 Firmware Manager Daemon (fwmgrd)

                                 The fwmgrd daemon is installed and enabled by default in SGI Management Center (SGIMC) 1.3 on SGI Altix ICE systems only. This daemon provides the services needed for the SGI Management Center graphical user interface to communicate with the firmware management infrastructure, and it must be running in order to access firmware management from the graphical user interface.

                                 Even if you intend to use only the CLI, it is recommended that you leave the fwmgrd daemon running and available.

                                By default, the fwmgrd log file is located at:

                                /var/log/fwmgrd.log

                                View this log for important messages during flashing operations from the SGI Management Center graphical interface.
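
                                 For example, to watch the log during a flashing operation, you can follow it with tail on the admin node; tail is a standard Linux command and is shown here only as an illustration:

                                 admin:~ # tail -f /var/log/fwmgrd.log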

                                Notes specific to Management Center 1.3

                                 The first release of the firmware management framework supports only SGI Altix ICE firmware released as RPMs. This includes sgi-ice-blade-bios, sgi-ice-blade-ib, sgi-ice-blade-zoar, sgi-ice-cmc, and sgi-ice-ib-switch. It covers the Altix ICE compute nodes but does not yet include other managed node types.

                                SGI intends to expand this firmware management framework to support additional node types in Altix ICE and SGI Rackable cluster hardware in later releases.


                                Note: SGI Altix ICE integrated InfiniBand switches are supported but only on SGI Altix ICE 8400 series systems or later. Some integrated InfiniBand switch parts in the SGI Altix ICE 8200 series systems will not flash properly with this framework.