SAN LUNs in SMS

I previously discussed how important it is to verify LUN IDs before writing over them in AIX. What about before AIX is booted in SMS? How can you verify your LUNs in SMS?

Overview of Boot from SAN

It's common in larger AIX environments to boot directly from SAN storage rather than internal drives. SANs provide reliable, fast disks for applications, so why not apply that to your OS as well?

Internal drives used to boot AIX are typically "Just a Bunch Of Disks" (JBOD), and the AIX LVM is used to mirror the rootvg across them for redundancy. Increasing virtualization and competition for internal resources between LPARs (for example, internal drive bays must be divided with split backplanes) have made booting from SAN more common. Features like SAN mirroring and snapshots provide more incentive to do so.
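
For reference, a minimal sketch of that traditional internal mirroring, assuming hdisk0 already holds rootvg and hdisk1 is the second internal drive (device names are assumptions):

% extendvg rootvg hdisk1              # add the second internal disk to rootvg
% mirrorvg -S rootvg hdisk1           # mirror rootvg, syncing in the background
% bosboot -ad /dev/hdisk1             # write a boot image to the new mirror
% bootlist -m normal hdisk0 hdisk1    # allow booting from either disk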

Boot from SAN recommendations

In the past, booting AIX from SAN was complicated, but in recent years AIX's built-in MPIO drivers have made it a much smoother experience.
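
Once AIX is running, the MPIO stack also makes it easy to confirm that the boot disk has redundant paths. A quick sketch, with hdisk0 and the fscsi parent adapters as assumed names and illustrative output:

% lspath -l hdisk0        # list every path to the boot disk
Enabled hdisk0 fscsi0
Enabled hdisk0 fscsi1

Every path should report Enabled; a Missing or Failed path points back to a cabling or zoning problem.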

Recent firmware updates (POWER9 firmware level FW950 [1] and higher) have made working with SAN LUNs in SMS easier as well! The ioinfo command at the Open Firmware prompt was also removed [2] and replaced by this new functionality.

Before moving to booting from SAN, please consider some of the following items.

Redundancy

When a system boots from SAN, it cannot recover from a SAN outage as easily as a system with local boot drives and only application data on the SAN. Now a SAN interruption could mean a hard crash instead of a few minutes of hung IO.

Systems intending to boot from SAN should have redundant HBA cards, not just redundant HBA ports, and those cards should link to two fabrics for redundancy. This may mean using only the top port of each HBA card and connecting multiple adapters.

Single Points Of Failure (SPOFs) should be avoided at all costs. Review your HBAs, cables, and switch connections to eliminate any SPOFs.

Virtualized LPARs should use dual redundant PowerVM VIO servers; be mindful of the HBA card topology to ensure the VIO servers span both fabrics.
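
When reviewing for SPOFs, it helps to list the FC adapters with their physical location codes and WWPNs so you can confirm that "redundant" ports really sit on separate cards. A sketch with illustrative device names and output:

% lsdev -Cc adapter | grep fcs                 # FC adapters and their slot locations
% lscfg -vl fcs0 | grep "Network Address"      # WWPN of the first port
        Network Address.............10000090FAABC123

Ports on the same card share a location code prefix (for example P1-C3) and differ only in the -T port suffix, so two such ports are still a single point of failure.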

/sms/SPOF.svg

Third party drivers

Booting AIX from SAN still works best on IBM storage with MPIO. Third party storage that uses drivers other than MPIO should be approached with caution.

One option when third party drivers are in use is to rely on locally booted PowerVM VIO servers and VSCSI. The VIO server has the third party driver installed, the rootvg LUNs for all clients are mapped to the VIO servers, and then those LUNs are mapped to client LPARs using VSCSI.

The VIO clients benefit from the SAN storage and they use built-in VSCSI drivers during boot. The third party driver is on VIO where it can be administered. The client LPARs can then use the third party driver with NPIV adapters to access SAN storage for applications.
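
As a rough sketch of the VSCSI side, from the VIO server's padmin shell (vhost0, hdisk5, and lpar1_rootvg are hypothetical names):

$ lsmap -vadapter vhost0
$ mkvdev -vdev hdisk5 -vadapter vhost0 -dev lpar1_rootvg

The lsmap command confirms which vhost adapter belongs to the client, and mkvdev maps that client's rootvg LUN over VSCSI so the client boots without any third party driver installed.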

/sms/BootTypes.svg

Virtualization

In an environment with PowerVM VIO servers, booting clients from SAN LUNs using NPIV adapters minimizes the VIO administration and complexity. This is an ideal configuration.
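
The NPIV side of this is a single mapping per client virtual FC adapter on each VIO server; a sketch from the padmin shell, with vfchost0 and fcs0 as assumed names:

$ vfcmap -vadapter vfchost0 -fcp fcs0
$ lsmap -npiv -vadapter vfchost0

The vfcmap command ties the client's virtual FC adapter to a physical port, and lsmap -npiv confirms the mapping and the client's fabric login status. After that, zoning and LUN mapping are done against the client's own WWPNs.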

Consider whether the VIO servers should boot locally from internal drives. In the event of a SAN issue, it's useful to have VIO servers online to perform troubleshooting. That's not an option without local boot media.

An excellent compromise is to architect systems with a split backplane and four local disks which can host two redundant VIO servers. Each boots locally with mirrored disks. Then all client LPARs can boot directly from SAN using NPIV.

PowerHA Clusters

Special attention must be paid to PowerHA clusters which boot from SAN. In a PowerHA cluster the rootvg is a critical VG monitored by the cluster watchdog. If a SAN issue blocks IO to the rootvg for a full minute, the affected node will deliberately crash to trigger a failover to a surviving node.

Cross-site mirroring can cause problems here if the SAN storage blocks IO during an inter-site communication problem.

As a result, I recommend keeping PowerHA clusters booting from local drives where complex SAN configurations are present. They can use local drives directly, VIO VSCSI mappings, or a rootvg mirrored across one local drive and one SAN LUN.

One mapping during deployment

If only one LUN is mapped to our system during deployment, that certainly helps narrow down the correct disk. For a new system being deployed, I recommend mapping only the rootvg LUN. Other LUNs may be mapped after AIX is installed.

Snapshots and rootvg

SAN storage often has the ability to take an instant point-in-time snapshot of LUNs for later backup or restore. This can also be a useful tool for AIX upgrades.

IBM has published best practices for snapshots, including using the freeze option of the chfs command to pause IO to filesystems and flush their buffers. Without these best practices, a "dirty" snapshot may be taken, which could contain corruption or data inconsistencies.
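
For data filesystems, the freeze is applied per JFS2 filesystem with a timeout in seconds; a minimal sketch using a hypothetical /data filesystem:

% chfs -a freeze=60 /data      # pause IO and flush buffers for up to 60 seconds
  (take the SAN snapshot here)
% chfs -a freeze=off /data     # thaw the filesystem immediately

If the snapshot takes longer than the timeout, the filesystem thaws on its own.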

Unlike data volume groups, you should not freeze the rootvg during operation. The best way to take a SAN snapshot of the rootvg is with the LPAR shut down.

Alternatively, I recommend adding a second bootable LUN and using the alt_disk_copy command instead to create a bootable copy.
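
A minimal sketch, assuming hdisk1 is the second bootable LUN:

% alt_disk_copy -d hdisk1 -B          # clone rootvg to hdisk1 without changing the bootlist
% alt_rootvg_op -X altinst_rootvg     # later: remove the altinst_rootvg definition to reuse the disk

The copy appears as altinst_rootvg (visible in the lspv -u output later in this article) and can be booted if an upgrade goes badly.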

LUN Characteristics

It's a common best practice to keep the rootvg as small as possible to minimize the space required for taking image backups via mksysb. Rootvg should also have a relatively light IO workload compared to application volume groups.

As a result, I often request rootvg LUNs which are thin provisioned at a modest size (50G) with optional compression on the SAN backend. These LUNs can also be placed on a slower storage tier.

Thin allocation means that the SAN will only allocate storage to the LUN when it is written to. This is a useful fiction to allow over-committing on the SAN array. The rootvg should stay small, making it an excellent candidate for thin allocation.
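
To see how much of a rootvg LUN is actually consumed, and therefore how much thin allocation saves, check the physical partition usage:

% lsvg rootvg | grep -E "TOTAL PPs|USED PPs|FREE PPs"

A rootvg that stays small leaves most of those partitions free, which is exactly what makes it a good candidate for thin allocation.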

Compression or slower economical backend storage is useful to minimize the rootvg footprint, as the load is not demanding and latency is less of an issue.

Installing AIX on SAN LUNs

Unfortunately the AIX installer and the bootable mksysb restore programs show very little information about disks. They are kept minimal so they can be booted from CD or network image, and do not have a full operating system to leverage for storage information.

Lacking an OS also makes discovering our port status and WWPNs more difficult. Without an OS keeping the ports online, many of the switch and storage GUI configuration tools won't see our system to allow the SAN administrator to configure resources. The recent improvements in SMS can help solve those problems.

To prepare a system for zoning, mapping, and boot from SAN start by activating the LPAR into SMS. The following instructions assume the LPAR is already at the SMS menu and that the system console is available.

Devices in SMS

The 950 firmware replaced the third option on the main SMS menu with a new item, "I/O Device Information". Many device types are available; however, this document focuses on the "SAN" and "FCP" options in the submenus.

Broadcast to assist in zoning

Testing SAN cables for connectivity is always worthwhile. Bringing the link up can also advertise our WWPNs to the switch and storage.

This is a great time to have the SAN administrator open any GUI tool they use with the switch and storage so they can confirm the ports light up briefly, and WWPN entry widgets populate with WWPNs.

After navigating to the SAN option, then the FCP option, a list of adapters is presented:

Select Media Adapter

1.  U7888.ND1.CABC123-P1-C3-T1    /pci@800000020000072/fibre-channel@0
2.  U7888.ND1.CABC123-P1-C6-T1    /pci@800000020000099/fibre-channel@0

When an adapter is selected, the port will be brought online temporarily. The adapter will try to find a link, and if one is found it will try to inventory the LUNs available.

If the port is not connected, a failure message like this appears:

                              .------------------.
                              |  PLEASE WAIT.... |
                              `------------------'

Link down
Cannot Init Link.

                         .----------------------------.
                         |  No SAN devices present    |
                         |  Press any key to continue |
                         `----------------------------'

That's pretty clear! Check that cable and try again.

Once this succeeds, the SAN zones can be created on the switch to allow our ports to communicate with the SAN storage controllers. Because the ports have now communicated with the switch, they should be on record and available for configuration.
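
The exact zoning commands depend on the switch vendor. As a sketch only, on a Brocade-style switch, with entirely hypothetical alias, zone, and config names and an illustrative host WWPN:

> alicreate "aixlpar1_fcs0", "10:00:00:ab:c1:23:45:54"
> zonecreate "aixlpar1_fcs0__ctrl_a", "aixlpar1_fcs0; ctrl_a"
> cfgadd "fabric_a_cfg", "aixlpar1_fcs0__ctrl_a"
> cfgsave
> cfgenable "fabric_a_cfg"

Repeat on the second fabric for the other adapter so both paths are zoned.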

Map LUNs to host

If zones are newly created, go back and execute the test on that adapter again to advertise our port to the SAN storage controller.

If the storage controller is located but no LUNs are found, we're presented with a list of unrecognized devices; entries may repeat when there are multiple paths to the controller:

Select Attached Device
 Pathname: /pci@800000020000072/fibre-channel@0
 WorldWidePortName: 100000abc1234554

1.  5005012312312319,0                 Unrecognized device type: 3f
2.  500501231231231c,0                 Unrecognized device type: 3f
3.  500501231231231c,0                 Unrecognized device type: 3f
4.  5005012312312319,0                 Unrecognized device type: 3f

Next, have the SAN administrator create a host record for the system on the storage controller and map our rootvg LUN to it. The SAN administrator should also provide the LUN ID.
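
The BUID shown later in SMS begins with IBM-2145, which suggests a Spectrum Virtualize based controller; on that family, the host record and mapping are a couple of CLI commands. A sketch with hypothetical host and volume names:

> mkhost -name aixlpar1 -fcwwpn 100000abc1234554
> mkvdiskhostmap -host aixlpar1 -scsi 0 aixlpar1_rootvg

The -scsi value is the LUN ID the SAN administrator should pass along for verification.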

Detect LUNs

Scan once more, and one or more LUNs should be visible:

Select Attached Device
 Pathname: /pci@800000020000072/fibre-channel@0
 WorldWidePortName: 100000abc1234554

1.  5005012312312319,0                 107 GB Disk drive
2.  500501231231231c,1000000000000     107 GB Disk drive

Confirm LUN identity

Selecting a LUN device shows additional details:

SAN Device Menu

 Target Address: 500501231231231c  Lun Address: 0
 Pathname: /pci@800000020000072/fibre-channel@0/disk@500501231231231c,0
 Device: 107 GB Disk drive
 BUID: IBM-2145-60050768123123123123000000000222

The BUID is the LUN serial number!

Before booting to the AIX installer, take note of which mappings (ie: "500501231231231c,0") correspond to which serial number (BUID). The installer only shows the mapping number.

Minimal information in the installer

When booted into the installer, I always choose "Change/Show Installation Settings and Install" from the main menu so I can select the disk and view the disk information:

          Name      Location Code   Size(MB)  VG Status   Bootable  Maps
  >>>  1  hdisk0    none            102400    none        Yes       No
       2  hdisk1    none            102400    none        Yes       No

>>>  0   Continue with choices indicated above

    66  Disks not known to Base Operating System Installation
    77  Display More Disk Information

Choosing 77 rotates through the columns describing the drives:

        Name      Physical Volume Identifier
>>>  1  hdisk0    0000000000000000
     2  hdisk1    0000000000000000

Choosing 77 again shows additional information, including the WWPN of the storage controller port and the incremental mapping number:

>>>  1  hdisk0    U78D4.ND1.ABC123K-P1-C10-T1-W500501231231231c-L0
     2  hdisk1    U78D4.ND1.ABC123K-P1-C10-T1-W500501231231231c-L1...

Unfortunately there's no serial number!

While it's often safe to write to a disk whose size matches expectations and has no PVID, the serial number is the only way to know for certain.

Comparing the mapping and serial number information from SMS against the installer, you can now be certain of the correct hdisk to use.

View in AIX

Once AIX is installed, the lspv -u command can be used to view the serial numbers for LUNs and compare them to the information from SMS:

% lspv -u
hdisk0          00123123ad934732                    rootvg          active      33213500501231231231C480000000000022204214503IBMfcp                  2343caaf-b0aa-cd71-3d29-15adad9qcc28
hdisk1          00123123be048213                    altinst_rootvg              33213500501231231231C480000000000022304214503IBMfcp                  10b2226b-8224-6323-a642-82134e5e5954
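
The same unique device identifier shown by lspv -u can also be queried per disk, which is handy when scripting the comparison:

% lsattr -El hdisk0 -a unique_id
unique_id 33213500501231231231C480000000000022204214503IBMfcp Unique device identifier False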

Bonus Feature: Test Network ports

Cable testing is always frustrating before an OS is installed. SMS now includes a way to test the SAN connection, as we discussed above. It can check network ports too!

Under the SMS main menu "I/O Device Information", there is now a listing for "Network ports":

I/O Device Information
 1.   SAN
 2.   SAS
 3.   NVMe
 4.   vSCSI
 5.   Firmware device driver Secure Boot validation failures
 6.   Network ports   <===============

This presents a list of network ports and can perform a link test on each. It can also test all ports at once to find which ones have a link.

 PowerPC Firmware
 Version FW950.30 (VM950_092)
 SMS (c) Copyright IBM Corp. 2000,2021 All rights reserved.
-------------------------------------------------------------------------------
 Network Port connectivity check menu
 Ports with * are connected to a network and are capable of network activity
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C11-T1
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C11-T2
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C11-T3
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C11-T4
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C8-T1
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C8-T2
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C8-T3
  PCIe2 4-Port (10GbE SFP+ & 1Gb   U78D4.ND1.CSS9999-P1-C8-T4

These tools are very handy when remotely supporting a physical installation.