Dr. Dobb's is part of the Informa Tech Division of Informa PLC


AIX Alternate Disk Installation




Jeff Marsh

In this article, I will describe some tools within AIX (some new, some old) that can help you reduce the off-hours time spent by your administration staff during maintenance upgrades. I will also show you some uses for these same toolsets that can help you reduce recovery times due to rootvg corruption.

Alternate Disk Installation

What is it? According to the IBM AIX Installation Guide:

"Alternate disk installation, available in AIX Version 4.3, allows installing the system while it is up and running, allowing installation or upgrade down time to be decreased considerably."

Thus, with another set of bootable drives within a server, you can install maintenance (e.g., upgrade your system from AIX 4.3.3.04 to AIX 4.3.3.06) during the day without interrupting or otherwise affecting the running applications. However, you will still need a reboot to make the new level active.

The support model prior to Alternate Disk Installation required all work to be done off-hours during an application maintenance window that generally took two to four hours. Now you can reduce that off-hours time from two to four hours per server to just the time needed to reboot. I'll also show you how you can complete multiple upgrades in that same reboot window using Network Installation Management (NIM).

Requirements

To enable Alternate Disk Installation, you need to install the following base-level filesets and upgrade to at least these corresponding fileset levels. These filesets do not require a reboot to install:

Base-level fileset                  Fileset level
bos.alt_disk_install.rte            26
bos.alt_disk_install.boot_images    27

You will also need another free, bootable drive within your server. In our case, we are configuring new servers with four internal drives for systems administration purposes: two drives for the mirrored primary rootvg, and two for alt_disk_install implementations. You could get by with just one additional drive, but we prefer to have two.
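As a rough aid for the drive requirement above, the following sketch lists the disks that lspv reports as not belonging to any volume group. It assumes the three-column lspv layout (disk name, PVID, volume group) and the function name free_disks is made up for illustration. A disk that shows None is only unassigned; whether it is bootable still has to be confirmed separately.

```shell
#!/bin/sh
# List physical volumes whose volume group column is "None".
# Reads lspv-style text on stdin so it can be tried against saved output.
free_disks() {
    awk '$3 == "None" { print $1 }'
}

# Sample listing for illustration; on a live system use: lspv | free_disks
free_disks <<'EOF'
hdisk0 000f261d90bf6ea0 rootvg
hdisk1 000f261dae86d104 rootvg
hdisk4 000f018d07d4f412 None
hdisk5 000f261dbde71c66 None
hdisk6 000f261dbd8eea89 nimresvg
EOF
```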

How It Works

Alternate Disk Installation works by cloning your primary rootvg running on hdisk0 and hdisk1, for example, to a second set of drives, hdisk2 and hdisk3. After the system completes those copies using basic find, backup, and restfile utilities, it will install the latest maintenance level you designate.

This process is shown in Figure 1. First, you clone hdisk0/1 to hdisk2/3, and then you apply maintenance to the newly cloned hdisk2/3 while the applications continue to run against hdisk0/1.

To complete this task from SMIT, issue the following fast path. You should expect to see the following panels:

smitty alt_clone

Clone the rootvg to an Alternate Disk:

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

* Target Disk(s) to install                       [hdisk2 hdisk3]
  Phase to execute                                 all               +
  image.data file                                  []                /
  Exclude list                                     []                /
  Bundle to install                                [update_all]      +
    -OR-
  Fileset(s) to install                            []
  Fix bundle to install                            []
    -OR-
  Fixes to install                                 []
  Directory or Device with images                  [/mnt]
  (required if filesets, bundles or fixes used)
  installp Flags
  COMMIT software updates?                         yes               +
  SAVE replaced files?                             no                +
  AUTOMATICALLY install requisite software?        yes               +
  EXTEND file systems if space needed?             yes               +
  OVERWRITE same or newer versions?                no                +
  VERIFY install and check file sizes?             no                +
  Customization script                             []                /
  Set bootlist to boot from this disk
  on next reboot?                                  yes
  Reboot when complete?                            no                +
  Verbose output?                                  no                +
  Debug output?                                    no                +
[BOTTOM]

F1=Help      F2=Refresh     F3=Cancel     F4=List
F5=Reset     F6=Command     F7=Edit       F8=Image
F9=Shell     F10=Exit       Enter=Do

From the example above, note the following:

• You are cloning to hdisk2 and hdisk3.

• You are running an update_all operation of maintenance mounted in the /mnt mount point. (In this case, this is a CD-ROM with the AIX 4.3.3.06 maintenance filesets.)

• You are specifying that this operation should change the bootlist to hdisk2 and hdisk3 after completion.

• You are not asking the process to complete an immediate reboot upon completion of the upgrade because this is something you want to schedule in an appropriate maintenance window.
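The same choices can also be expressed on the command line. The sketch below is my reading of how the panel settings map to alt_disk_install flags (-C to clone, -b for the bundle, -l for the image location), not the exact command SMIT generates; press F6 in the panel to see that one. The function name clone_and_update and the DRY_RUN switch are illustrative. Leaving out -B keeps the default of setting the bootlist, and leaving out -r means no automatic reboot.

```shell
#!/bin/sh
# Print (dry run) or execute a clone-and-update of rootvg.
# DRY_RUN defaults to 1 so nothing is changed unless you opt in.
DRY_RUN=${DRY_RUN:-1}

clone_and_update() {
    images_dir=$1
    shift
    # -C: clone the running rootvg
    # -b: bundle to install on the clone (update_all, as in the panel)
    # -l: directory or device holding the install images
    # Remaining arguments are the target disks.
    set -- alt_disk_install -C -b update_all -l "$images_dir" "$@"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

clone_and_update /mnt hdisk2 hdisk3
```

With DRY_RUN left at its default the script only prints the command, which makes it easy to review before running it for real.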

At the completion of the operation, you will notice from the bootlist -m normal -o command that the bootlist will be set to hdisk2 hdisk3, and issuing an lspv command will show the following:

root@aknimp1:/> lspv
hdisk0    000f261d90bf6ea0    rootvg
hdisk1    000f261dae86d104    rootvg
hdisk2    000f261db52d4d95    altinst_rootvg
hdisk3    000f261db52d4ca6    altinst_rootvg
hdisk4    000f018d07d4f412    None
hdisk5    000f261dbde71c66    None
hdisk6    000f261dbd8eea89    nimresvg

At this point, you have cloned and installed the latest AIX maintenance level during the day. You are now ready to activate that latest maintenance with a reboot operation at whatever time is appropriate for the outage to your application users. You can save significant off-hours time for maintenance upgrades; our off-hours time has been reduced to the time needed for a simple reboot.
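Before scheduling the reboot, the lspv check described above can be scripted. This is a minimal sketch, assuming the three-column lspv layout shown earlier; the function name in_altinst_rootvg is hypothetical. It succeeds only when every disk you name is listed under altinst_rootvg.

```shell
#!/bin/sh
# Succeeds only if every disk named as an argument appears in
# altinst_rootvg in the lspv-style listing read from stdin.
in_altinst_rootvg() {
    awk -v want="$*" '
        BEGIN { n = split(want, disks, " ") }
        $3 == "altinst_rootvg" { seen[$1] = 1 }
        END {
            bad = 0
            for (i = 1; i <= n; i++) {
                if (!(disks[i] in seen)) {
                    print "missing: " disks[i]
                    bad = 1
                }
            }
            exit bad
        }'
}

# Sample listing for illustration; on a live system use:
#   lspv | in_altinst_rootvg hdisk2 hdisk3
in_altinst_rootvg hdisk2 hdisk3 <<'EOF' && echo "clone present on hdisk2 hdisk3"
hdisk0 000f261d90bf6ea0 rootvg
hdisk1 000f261dae86d104 rootvg
hdisk2 000f261db52d4d95 altinst_rootvg
hdisk3 000f261db52d4ca6 altinst_rootvg
EOF
```

Pairing this with a look at bootlist -m normal -o gives a quick go/no-go check ahead of the maintenance window.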

Alternate Disk Installation -- After the Reboot

After the reboot, issue the oslevel command or complete the appropriate verifications to ensure your maintenance upgrade occurred as expected. If you issue the lspv command, you will notice the following:

root@aknimp1:/> lspv
hdisk0    000f261d90bf6ea0    old_rootvg
hdisk1    000f261dae86d104    old_rootvg
hdisk2    000f261db52d4d95    rootvg
hdisk3    000f261db52d4ca6    rootvg
hdisk4    000f018d07d4f412    None
hdisk5    000f261dbde71c66    None
hdisk6    000f261dbd8eea89    nimresvg

Both hdisk2 and hdisk3, from which you have booted, now show a volume group identifier of rootvg. Hdisks 0 and 1 now show a volume group of old_rootvg and are varied off.

Now, you have several options. My preference is to leave hdisk0 and hdisk1 alone with the old maintenance levels in case you need to fall back on them.

Let's assume that after the reboot your applications aren't working well with the latest maintenance. The previous support model suggests that you need to get the mksysb backup taken prior to your upgrade and begin a restore process. This could take two hours or more, with the hope that the tape image was good. The new support model with Alternate Disk Installation says to change your bootlist back to hdisk0 and hdisk1 and to reboot the server. At some future point, when you decide the maintenance is good and you don't need to fall back, you can clone the latest maintenance residing on hdisk2/3 back to hdisk0/1.
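The fallback path described above comes down to two commands. The following sketch prints them by default rather than running them; the run helper, the fall_back name, and the DRY_RUN switch are my own additions, and shutdown -Fr is the usual AIX fast shutdown with reboot.

```shell
#!/bin/sh
# Fall back to the previous maintenance level by booting the old disks.
# DRY_RUN defaults to 1: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fall_back() {
    # Point the normal-mode bootlist at the old disks, then reboot.
    run bootlist -m normal "$@" || return 1
    run shutdown -Fr
}

fall_back hdisk0 hdisk1
```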

Cloning Back to hdisk0/1

To complete the cloning of hdisk2/3 back to hdisk0/1, you must issue the following commands:

alt_disk_install -W hdisk0 hdisk1 -- Wakes up the old_rootvg

alt_disk_install -S -- Puts the old_rootvg back to sleep

alt_disk_install -X old_rootvg -- Removes the old_rootvg volume group definition associated with hdisk0/1 from the ODM and leaves the disks with a volume group value of "None", which allows the new clone to proceed cleanly.

smitty alt_clone -- Reclone back to hdisk0/1 using the previous example.

I will discuss using the above commands further in the next section; however, in order to reclone drives that have been previously used to boot from, you must follow the commands verbatim to remove the knowledge of the old_rootvg volume group name from the ODM.
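The sequence above can be collected into one script so that no step is skipped. This is a sketch only: it prints the commands unless DRY_RUN=0 is set, the names run and reclone_back are illustrative, and the final alt_disk_install -C stands in for the smitty alt_clone step on the assumption that a plain clone with no additional maintenance is wanted.

```shell
#!/bin/sh
# Reclone the running rootvg back onto disks that were booted from before.
# DRY_RUN defaults to 1: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reclone_back() {
    # Wake up old_rootvg on the old disks.
    run alt_disk_install -W "$@" || return 1
    # Put it back to sleep.
    run alt_disk_install -S || return 1
    # Remove the old_rootvg definition from the ODM.
    run alt_disk_install -X old_rootvg || return 1
    # Clone the running rootvg onto the freed disks.
    run alt_disk_install -C "$@"
}

reclone_back hdisk0 hdisk1
```

Each step stops the sequence on failure, so a problem with the wake-up or cleanup never leads straight into a clone attempt.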

Other Uses for alt_disk_install

The alt_disk_install command can also help with the following:

• Nightly backup of your system -- Using alt_disk_install, you can back up your system nightly (or at whatever frequency is appropriate) without having to manage mksysb tapes. If you suffer some type of rootvg corruption, either major or minor, you can restore using the data on the cloned drives.

• mksysb images -- The alt_disk_install command can be used to install mksysb images (AIX 4.3 or later) onto systems running AIX 4.1 and later versions.

• You can also use alt_disk_install for recovery of corrupted files in rootvg and to reduce the size of logical volumes in rootvg, as described in the following sections.

Recovery of Corrupted Files in rootvg

If you suffer major corruption (hdisk failure), and the server crashes, and if you have cloned that data to another bootable drive, you could interface with SMS, for example, to change your bootlist to your other cloned drives and quickly recover the server.

If you suffer minor corruption within the rootvg where a file or a few files are corrupted or inadvertently deleted, you can wake up the cloned copy of the rootvg and copy those deleted or corrupted files back to the primary rootvg while the server is up and running.

In this example, you are booted against hdisk0/1 and have recently cloned the system to hdisk2/3. To access the cloned copy of the rootvg while the server is up and running, complete the following:

1. alt_disk_install -W hdisk2 hdisk3 -- Wakes up the cloned copy:

root@aknimp1:/> alt_disk_install -W hdisk2 hdisk3

Waking up altinst_rootvg volume group ...

Replaying log for /dev/alt_hd4.

2. From the output of a df -k command, you will notice that the wake-up command has mounted the alternate rootvg filesystems under the /alt_inst prefix:

root@aknimp1:/> df -k
Filesystem        1024-blocks     Free %Used  Iused %Iused Mounted on
/dev/hd4                49152     5608   89%   1226     5% /
/dev/hd2               753664     5056  100%  19966    11% /usr
/dev/hd9var             16384    14340   13%    222     6% /var
/dev/hd3                32768    30376    8%     98     2% /tmp
/dev/lvexport          131072   126772    4%     41     1% /export
/dev/lv01             4980736    94468   99%   4546     1% /export/lpp_source
/dev/lv02              917504   448868   52%  29468    13% /export/spot
/dev/lvmksysb        15204352  3381328   78%     31     1% /export/mksysb
/dev/lvadmin           131072   126868    4%     25     1% /admin
/dev/hd1                16384    15820    4%     20     1% /home
/dev/lvadsm             16384       56  100%     21     1% /var/adsm
/dev/alt_hd4            49152     5704   89%   1192     5% /alt_inst
/dev/alt_lvadmin       131072   126868    4%     25     1% /alt_inst/admin
/dev/alt_hd1            16384    15820    4%     20     1% /alt_inst/home
/dev/alt_hd3            32768    30376    8%     98     2% /alt_inst/tmp
/dev/alt_hd2           753664     5056  100%  19966    11% /alt_inst/usr
/dev/alt_hd9var         16384    14380   13%    219     6% /alt_inst/var
/dev/alt_lvadsm         16384     1848   89%     20     1% /alt_inst/var/adsm

3. Copy the corrupted files from the appropriate alt_inst logical volume/filesystem. In this case, I corrupted my /etc/hosts file, so I will issue the following command to restore it from my latest cloned backup:

cp /alt_inst/etc/hosts /etc/hosts

4. When you have restored the required files, put the altinst_rootvg back to sleep, which will unmount the /alt_inst logical volumes/filesystems by issuing:

alt_disk_install -S
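Steps 1 through 4 can be wrapped so that the clone is always put back to sleep, even when a copy fails. This is a minimal sketch: the function name restore_from_clone and the WAKE_CMD, SLEEP_CMD, ALT_ROOT, and DEST_ROOT variables are hypothetical, introduced so the logic can be rehearsed on a test directory before it touches a real rootvg.

```shell
#!/bin/sh
# Restore files from a sleeping rootvg clone into the running system.
# The commands and roots are overridable for rehearsal on a test tree.
WAKE_CMD=${WAKE_CMD:-"alt_disk_install -W"}
SLEEP_CMD=${SLEEP_CMD:-"alt_disk_install -S"}
ALT_ROOT=${ALT_ROOT:-/alt_inst}   # where the woken clone is mounted
DEST_ROOT=${DEST_ROOT:-}          # empty means the live filesystem

restore_from_clone() {
    disks=$1
    shift
    # Wake the clone; its filesystems appear under $ALT_ROOT.
    $WAKE_CMD $disks || return 1
    rc=0
    for f in "$@"; do
        # Copy each requested file, preserving mode and timestamps.
        cp -p "$ALT_ROOT$f" "$DEST_ROOT$f" || rc=1
    done
    # Always put the clone back to sleep, even if a copy failed.
    $SLEEP_CMD || rc=1
    return $rc
}

# Example call for the /etc/hosts case in the text (not run here):
#   restore_from_clone "hdisk2 hdisk3" /etc/hosts
```

Returning a non-zero status when any copy fails lets a calling script decide whether to alert someone, while the clone is left unmounted either way.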

Reducing Logical Volumes Size Within the rootvg

Remember the pain associated with the need to reduce the size of a logical volume within the rootvg? It took a tape restore of the system to complete. Now, you can complete that reduction within a simple cloning process. The steps to complete that process are as follows:

1. Issue a mkszfile command to create the /image.data file.

2. Edit the /image.data file and specify SHRINK=yes in the logical_volume_policy stanza:

image_data:
        IMAGE_TYPE= bff
        DATE_TIME= Tue Oct 3 10:29:55 CDT 2000
        UNAME_INFO= AIX aknimp1 3 4 000F261D4C00
        PRODUCT_TAPE= no
        USERVG_LIST= nimresvg
        OSLEVEL= 4.3.3.10

logical_volume_policy:
        SHRINK= yes
        EXACT_FIT= no

ils_data:
        LANG= en_US

3. Clone the rootvg to hdisk2 and hdisk3, specifying your customized /image.data file by issuing one of the following commands:

smitty alt_clone (remember to specify the location of your image.data file on the image.data file prompt)

or

alt_disk_install -i /image.data -B -C hdisk2 hdisk3 (from the command line)

4. After the completion of the cloning operation, wake up the altinst_rootvg by issuing:

alt_disk_install -W hdisk2 hdisk3

5. Review your df -k output and compare the sizes of the primary filesystems with those of their /alt_inst counterparts.

6. If you are satisfied with the sizing reduction, change your bootlist (bootlist -m normal hdisk2 hdisk3) and reboot.
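Two of the steps above lend themselves to small helpers: setting SHRINK in the image.data text (step 2) and lining up primary and /alt_inst sizes from df -k (step 5). The sketch below assumes the stanza layout and the seven-column df -k layout shown in this article; the function names set_shrink and compare_sizes are made up for illustration. The sample rows reuse figures from the earlier df -k listing, so the two sizes match; after a shrinking clone the third column would be smaller.

```shell
#!/bin/sh
# Force SHRINK= yes in image.data text read from stdin.
# Leading whitespace and the other stanza lines are left untouched.
set_shrink() {
    sed 's/^\([[:space:]]*SHRINK=\).*/\1 yes/'
}

# From df -k text on stdin, print: mount point, primary KB, clone KB
# for every filesystem that has a counterpart under /alt_inst.
compare_sizes() {
    awk '
        NR > 1 && $7 !~ /^\/alt_inst/ { size[$7] = $2 }
        NR > 1 && $7 ~ /^\/alt_inst/ {
            m = substr($7, 10)          # strip the "/alt_inst" prefix
            if (m == "") m = "/"
            alt[m] = $2
        }
        END { for (m in alt) if (m in size) print m, size[m], alt[m] }
    ' | sort
}

# Sample rows for illustration; on a live system use: df -k | compare_sizes
compare_sizes <<'EOF'
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 49152 5608 89% 1226 5% /
/dev/hd2 753664 5056 100% 19966 11% /usr
/dev/alt_hd4 49152 5704 89% 1192 5% /alt_inst
/dev/alt_hd2 753664 5056 100% 19966 11% /alt_inst/usr
EOF
```

To apply the first helper, write the result to a new file and review it before use, for example set_shrink < /image.data > /tmp/image.data.new.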

Network Installation Management (NIM)

I want to briefly discuss NIM and show how well it interfaces with alternate disk installation. It can easily help you to manage upgrades on a group of servers, thus saving you even more time.

What Is NIM?

Paraphrasing from the AIX Network Installation Management Guide and Reference, "NIM is a base component of AIX that permits and aids in the installation and maintenance of AIX, its basic operating system, and additional software and fixes that may be supplied over the network. NIM provides for the customization of machines both during and after installation. As a result, NIM has eliminated the reliance of the systems administration staff on tapes and CD-ROMs for software installation and maintenance."

In this case, you are using NIM to manage a group of standalone machines (NIM clients) from a centrally located, network-attached NIM master. From the NIM master, you can manage operating system installations, maintenance upgrades, mksysb images for backup and recovery, installation of new servers (cloning), and the re-installation of existing servers in case of a disaster.

There's a great deal of functionality provided by NIM. I recommend reviewing the usage guide to see what NIM features could benefit your environment. I also recommend a good Redbook from IBM, NIM: From A to Z in AIX 4.3 (SG24-5524-00), which was published in February 2000.

I won't cover the specifics of setting up the NIM master and the corresponding NIM client configurations; it is not an overly complicated process. However, it will require someone with NIM-specific knowledge to lay out the functional NIM environment. If you support SP complexes, you have already had a fair amount of exposure to NIM even though it is buried one layer below PSSP.

One key feature of NIM that will help manage a group of servers concurrently is the Machine Group definition. Within NIM, you can operate as easily on a single machine as you can a group of machines. For instance, we have defined several machine groups within our NIM master environment. These definitions allow us to operate on a group of like servers concurrently.

How Does It Integrate with Alternate Disk Installation?

NIM knows how to fully exploit Alternate Disk Installation. For example, look at the initial clone and update_all operation. Let's say you want to use NIM to extend the model (instead of upgrading the maintenance level on a single server) and you want to complete this operation on ten Lotus Notes servers that are similarly configured and are defined in a Notes machine group within NIM. From SMIT on the NIM master, issue the following fast path and you will see this panel:

smitty nim_alt_clone

Clone the rootvg to an Alternate Disk

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* Target Machine / Group to Install                [NOTES]              +
* Target Disk(s) to install                        [hdisk2 hdisk3]
  Phase to execute                                  all                 +
  IMAGE_DATA resource                               []                  +/
  EXCLUDE_FILES resource                            []                  +/
  (leave blank to include all files in backup)
  BUNDLE to install                                 []                  +
    -OR-
  Fileset(s) to install                             []
  FIX_BUNDLE to install                             []                  +
    -OR-
  FIXES to install                                  [update_all]
  LPP_SOURCE                                        [aix433_lppsource]  +
  (required if filesets, bundles or fixes used)
  installp Flags
  COMMIT software updates?                          yes                 +
  SAVE replaced files?                              no                  +
  AUTOMATICALLY install requisite software?         yes                 +
  EXTEND filesystems if space needed?               yes                 +
  OVERWRITE same or newer versions?                 no                  +
  VERIFY install and check file sizes?              no                  +
  Customization SCRIPT resource                     []                  +/
  Set bootlist to boot from this disk
  on next reboot                                    yes                 +
  Reboot when complete?                             no                  +
  Verbose output?                                   no                  +
  Debug output?                                     no                  +
  Group controls (only valid for group targets):
  Number of concurrent operations                   []                  #
  Time limit (hours)                                []                  #

F1=Help      F2=Refresh     F3=Cancel     F4=List
F5=Reset     F6=Command     F7=Edit       F8=Image
F9=Shell     F10=Exit       Enter=Do

In this example, you would cause every server defined in the NOTES machine group to begin cloning itself from hdisk0/1 to hdisk2/3. At the completion of the cloning operation, NIM would then NFS-mount the aix433_lppsource resource (in this case, the AIX 4.3.3 lppsource filesystem, which includes the 4.3.3.06 maintenance) and apply it to the newly cloned hdisk2/3 on each of these servers. The panel settings also instruct NIM to change the bootlist on each of these servers as a part of the operation without causing an immediate reboot. I recommend, however, using NIM to schedule a reboot of all these servers during the maintenance window.
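For reference, the panel above corresponds to a nim -o alt_disk_install operation run on the NIM master. The sketch below only prints a command line for review. The attribute names (source, disk, fixes, lpp_source) are my recollection of the NIM alt_disk_install operation rather than something taken from this article, so treat them as assumptions and compare against the command SMIT shows with F6 before relying on them. The function name nim_clone_cmd is illustrative.

```shell
#!/bin/sh
# Print a NIM alt_disk_install command line for a machine or group.
# Nothing is executed; review the output before running it on the master.
nim_clone_cmd() {
    target=$1       # NIM machine or machine group, e.g. NOTES
    lpp_source=$2   # NIM lpp_source resource holding the maintenance
    shift 2         # remaining arguments are the target disks
    printf "nim -o alt_disk_install -a source=rootvg -a disk='%s' -a fixes=update_all -a lpp_source=%s %s\n" \
        "$*" "$lpp_source" "$target"
}

nim_clone_cmd NOTES aix433_lppsource hdisk2 hdisk3
```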

All of this work, including the cloning and upgrading of the maintenance level, can be completed during the day without affecting the running application (e.g., Notes). For the previous support model, this same upgrade would have taken about 2 hours per server plus reboot time to complete during an application maintenance window, generally in the middle of the night. If a single person worked to complete this process, this could have taken about 25 hours spread across multiple weekends to complete. With NIM and Alternate Disk Installation, this upgrade outage can be reduced to the time to reboot these 10 servers concurrently (or about 30 minutes, in our case). Note that your time may vary depending on speed of network, number of filesets being updated, time to reboot, and problems encountered.
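The 25-hour figure can be reproduced with simple arithmetic. The sketch below assumes a 30-minute reboot per server, which is inferred from the 30-minute concurrent window mentioned above rather than stated for the serial case; the variable names are illustrative.

```shell
#!/bin/sh
# Rough off-hours comparison for the ten-server example in the text.
servers=10
upgrade_min=120   # about 2 hours of upgrade work per server (from the text)
reboot_min=30     # assumed reboot time per server

# Old model: each server is upgraded and rebooted one after another, off-hours.
old_minutes=$(( servers * (upgrade_min + reboot_min) ))
# New model: upgrades happen during the day; only a concurrent reboot is off-hours.
new_minutes=$reboot_min

echo "old model: $(( old_minutes / 60 )) hours off-hours"
echo "new model: $new_minutes minutes off-hours"
```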

Figure 2 shows the process using NIM/Machine Groups and Alternate Disk Installation. First, you instruct the NIM master to have each of the servers in the defined machine group clone hdisk0/1 to hdisk2/3 (depicted in red). Then, NIM will NFS-mount the appropriate LPPSOURCE filesystem containing the AIX 4.3.3.06 maintenance level and apply that maintenance to the newly cloned drives (operation in green). Again, this process happens concurrently on all servers in the defined NIM machine group without affecting the running applications.

Conclusion

My team is in the process of rolling out this methodology change. I think we can significantly reduce the amount of time spent in support of our current AIX standalone infrastructure. I also think Alternate Disk Installation and NIM can help you better manage your infrastructure and provide some consistency to your installation, upgrade, maintenance, and build procedures. I hope the above discussion will help you significantly reduce the amount of off-hours time associated with maintenance or fileset upgrades within AIX.

Jeff Marsh is the Systems Advisor to the UNIX Server Team working at American Century Investments, a premier investment manager serving nearly two million individual and institutional investors. Jeff can be contacted at: [email protected].


