
Thursday, February 17, 2011

Live Upgrade to install the recommended patch cluster on a ZFS snapshot

Live Upgrade used to require that you find some free slices (partitions) and then fiddle with the -R "alternate root" options to install the patch cluster to an ABE. With ZFS all of those pains have just ... gone away ...

Nowadays Live Upgrade on ZFS doesn't even copy the installation; instead it automatically clones a snapshot of the boot environment, saving much time and disk space! Even the patch install script is geared towards patching an Alternate Boot Environment!

The patching process involves six steps:

  1. Apply Pre-requisite patches
  2. Create an Alternate Boot Environment
  3. Apply the patch cluster to this ABE
  4. Activate the ABE
  5. Reboot
  6. Cleanup

Note: The system remains online throughout all except the reboot step.

In preparation you uncompress the downloaded patch cluster file. I created a ZFS file system, mounted it on /patches, and extracted the cluster there (a sketch follows below). Furthermore, you have to read the cluster README file - it contains a "password" needed to install, plus information about pre-requisites and gotchas. Read the file. This is your job!
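For reference, the preparation looked roughly like this on my system (rpool/patches is the dataset I destroy again in step 6; the name and location of the downloaded zip file will differ depending on the cluster you fetched):

# zfs create rpool/patches
# zfs set mountpoint=/patches rpool/patches
# cd /patches
# unzip /var/tmp/10_x86_Recommended.zip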

The pre-requisites are essentially just patches to the patch-add tools, conveniently included in the Patch Cluster!

Step 1 - Install the pre-requisites for applying the cluster to the ABE

# cd /patches/10_x86_Recommended
# ./installcluster --apply-prereq

Note - If you get an Error due to insufficient space in /var/run, see my previous blog post here!

Step 2 - Create an Alternate boot environment (ABE)

# lucreate -c s10u9 -n s10u9patched -p rpool

Checking GRUB menu...
Analyzing system configuration.
No name for current boot environment.
Current boot environment is named <s10u9>.
Creating initial configuration for primary boot environment <s10u9>.
The device </dev/dsk/c1t0d0s0> is not a root device for any boot environment; cannot get BE ID.
PBE configuration successful: PBE name <s10u9> PBE Boot Device </dev/dsk/c1t0d0s0>.
Comparing source boot environment <s10u9> file systems with the file
system(s) you specified for the new boot environment. Determining which
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment <s10u9patched>.
Source boot environment is <s10u9>.
Creating boot environment <s10u9patched>.
Cloning file systems from boot environment <s10u9> to create boot environment <s10u9patched>.
Creating snapshot for <rpool/ROOT/s10_0910> on <rpool/ROOT/s10_0910@s10u9patched>.
Creating clone for <rpool/ROOT/s10_0910@s10u9patched> on <rpool/ROOT/s10u9patched>.
Setting canmount=noauto for </> in zone <global> on <rpool/ROOT/s10u9patched>.
Saving existing file </boot/grub/menu.lst> in top level dataset for BE <s10u9patched> as <mount-point>//boot/grub/menu.lst.prev.
File </boot/grub/menu.lst> propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE <s10u9patched> in GRUB menu
Population of boot environment <s10u9patched> successful.
Creation of boot environment <s10u9patched> successful.

There is now an extra boot environment to which we can apply the patch cluster. Note - for what it is worth, if you just need a test environment to play in, you can now luactivate the alternate boot environment and make any changes you like to the running system. If the system breaks, all it takes to undo any and all changes is to re-activate the original boot environment and reboot - see the sketch below.
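A sketch of that throwaway-sandbox idea, using the BE names from above - activate the clone, reboot into it, experiment, and fall back by re-activating the original:

# luactivate s10u9patched
# init 6
  (... experiment freely on the running system ...)
# luactivate s10u9
# init 6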

Step 3 - Apply the patch cluster to the BE named s10u9patched.

# cd /patches/10_x86_Recommended
# ./installcluster -B s10u9patched

I am not showing the long and boring output from the installcluster script, as this blog post is already far too long. The patching runs for quite a while; plan for at least two hours. Monitor the process and check the log for warnings - one way to capture the output is shown below. Depending on how long it has been since patches were last applied, some of the patches in the cluster can affect your ability to log in after rebooting. Again: READ the README!
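If you want a log of the run to grep through afterwards, something along these lines works (the log file name is just my own choice; the script also prints the location of its own log when it starts):

# ./installcluster -B s10u9patched 2>&1 | tee /var/tmp/cluster-patch.log
# egrep -i 'warning|error' /var/tmp/cluster-patch.log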

Step 4 - Activate the ABE.

# luactivate s10u9patched
System has findroot enabled GRUB
Generating boot-sign, partition and slice information for PBE <s10u9>
A Live Upgrade Sync operation will be performed on startup of boot environment <s10u9patched>.

Generating boot-sign for ABE <s10u9patched>
Generating partition and slice information for ABE <s10u9patched>
Copied boot menu from top level dataset.
Generating multiboot menu entries for PBE.
Generating multiboot menu entries for ABE.
Disabling splashimage
Re-enabling splashimage
No more bootadm entries. Deletion of bootadm entries is complete.
GRUB menu default setting is unaffected
Done eliding bootadm entries.

**********************************************************************

The target boot environment has been activated. It will be used when you
reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You
MUST USE either the init or the shutdown command when you reboot. If you
do not use either init or shutdown, the system will not boot using the
target BE.

**********************************************************************

In case of a failure while booting to the target BE, the following process
needs to be followed to fallback to the currently working boot environment:

1. Boot from the Solaris failsafe or boot in Single User mode from Solaris
Install CD or Network.

2. Mount the Parent boot environment root slice to some directory (like
/mnt). You can use the following commands in sequence to mount the BE:

     zpool import rpool
     zfs inherit -r mountpoint rpool/ROOT/s10_0910
     zfs set mountpoint=<mountpointName> rpool/ROOT/s10_0910
     zfs mount rpool/ROOT/s10_0910

3. Run <luactivate> utility with out any arguments from the Parent boot
environment root slice, as shown below:

     <mountpointName>/sbin/luactivate

4. luactivate, activates the previous working boot environment and
indicates the result.

5. Exit Single User mode and reboot the machine.

**********************************************************************

Modifying boot archive service
Propagating findroot GRUB for menu conversion.
File </etc/lu/installgrub.findroot> propagation successful
File </etc/lu/stage1.findroot> propagation successful
File </etc/lu/stage2.findroot> propagation successful
File </etc/lu/GRUB_capability> propagation successful
Deleting stale GRUB loader from all BEs.
File </etc/lu/installgrub.latest> deletion successful
File </etc/lu/stage1.latest> deletion successful
File </etc/lu/stage2.latest> deletion successful
Activation of boot environment <s10u9patched> successful.

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
s10u9                      yes      no     no        yes    -
s10u9patched               yes      yes    yes       no     -

Carefully take note of the details on how to recover from a failure. Making a hard copy of this is not a bad idea! Also note that you have to use either init or shutdown to effect the reboot, as the other commands bypass some of the delayed-action scripts! Hence ...

Step 5 - Reboot using shutdown or init ...

# init 6

Monitor the boot-up sequence. A few handy commands while you are performing the upgrade include:

# lustatus
# bootadm list-menu
# zfs list -t all

You will eventually (after confirming that everything works as expected) want to free up the disk space held by the snapshots. The first command below cleans up the redundant Live Upgrade entries as well as the associated ZFS snapshots; the second removes the temporary ZFS file system used for the patching.

Step 6 - Cleanup

# ludelete s10u9
# zfs destroy rpool/patches

Again, no worries about where the space comes from - ZFS simply manages it! Live Upgrade takes care of your GRUB boot menu and gives you clear instructions on how to recover if anything goes wrong.

Tuesday, July 1, 2008

Let ZFS manage even more space more efficiently

The idea of using ZFS to manage process core dumps begs to be expanded to at least crash dumps. This also enters into the realm of Live Upgrade, as it eliminates the need to sync a potentially large amount of data when activating a new BE!

Previously I created a ZFS file system in the root pool, and mounted it on /var/cores.

The same purpose would be even better served with a generic ZFS file system which can be mounted on any currently active Live-Upgrade boot environment. The discussion here suggests the use of a ZFS file system rpool/var_shared, mounted under /var/shared. Directories such as /var/crash and /var/cores can then be moved into this shared file system.

So:

/ $ pfexec ksh -o vi
/ $ zfs create rpool/var_shared
/ $ zfs set mountpoint=/var/shared rpool/var_shared
/ $ mkdir -m 1777 /var/shared/cores
/ $ mkdir /var/shared/crash
/ $ mv /var/crash/`hostname` /var/shared/crash

View my handiwork:

/ $ ls -l /var/shared

total 6
drwxrwxrwt   2 root     root           2 Jun 27 17:11 cores
drwx------   3 root     root           3 Jun 27 17:11 crash
/ $ zfs list -r rpool
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     13.3G  6.89G    44K  /rpool
rpool/ROOT                10.3G  6.89G    18K  legacy
rpool/ROOT/snv_91         5.95G  6.89G  5.94G  /.alt.tmp.b-b0.mnt/
rpool/ROOT/snv_91@snv_92  5.36M      -  5.94G  -
rpool/ROOT/snv_92         4.33G  6.89G  5.95G  /
rpool/dump                1.50G  6.89G  1.50G  -
rpool/export              6.83M  6.89G    19K  /export
rpool/export/home         6.81M  6.89G  6.81M  /export/home
rpool/swap                1.50G  8.38G  10.3M  -
rpool/export/cores          20K  2.00G    20K  /var/cores
rpool/var_shared            22K  3.00G    22K  /var/shared

Just to review the current settings for saving crash dumps:

/ $ dumpadm

      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/solwarg
  Savecore enabled: yes

Set it to use the new path I made above:

/ $ dumpadm -s /var/shared/crash/`hostname`

      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/shared/crash/solwarg
  Savecore enabled: yes

Similarly I update the process core dump settings:

/ $ coreadm -g /var/shared/cores/core.%z.%f.%u.%t
/ $ coreadm

     global core file pattern: /var/shared/cores/core.%z.%f.%u.%t
     global core file content: default
       init core file pattern: core
       init core file content: default
            global core dumps: disabled
       per-process core dumps: enabled
      global setid core dumps: enabled
 per-process setid core dumps: disabled
     global core dump logging: enabled

And finally, some cleaning up:

/ $ zfs destroy rpool/export/cores
/ $ cd /var
/var $ rmdir crash
/var $ ln -s shared/crash
/var $ rmdir cores

As previously, the above soft link is just in case there is a naughty script or tool somewhere with a hard-coded path to /var/crash/`hostname`. I don't expect to find something like that in officially released Sun software, but I do sometimes use programs not officially released or supported by Sun.

This makes me wonder what else I can make it do! I'm looking forward to my next Live Upgrade to see how well it preserves my configuration before I attempt to move any of the spool directories from /var to /var/shared!



Monday, June 30, 2008

Use Live Upgrade already

If you are still not using Live Upgrade, you need to make learning it a priority. It will save you hours and make your manager happy, because it costs nothing and gives you a simple, reliable and fast way to roll back your changes. You just need a few (about 10) GB of free disk space, be it in your root ZFS pool, on an unused disk, or even any slice on any disk in the system.

The Live Upgrade concept is simple: make a copy of your "boot environment", run the upgrade or patching against this copy (called the alternate boot environment), and finally activate it.

Creation of the new boot environment is done by running a few simple commands that copy and update the files in the new boot environment. This can (and does) take a considerable amount of time, but it runs in the background while the system is up and running, with all services online and active.
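In outline, a Live Upgrade might look something like this (the BE names and the media path are examples only, not from any particular system):

# lucreate -c oldBE -n newBE
# luupgrade -u -n newBE -s /cdrom/sol_10_install_image
# luactivate newBE
# init 6

On a ZFS root, lucreate clones the current BE automatically; on UFS you point it at a spare slice with -m (see the note about slices at the end of this post).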

The Live Upgrade commands come from three packages that you should install from the target OS's install media - for example, if you want to upgrade from Solaris 9 to Solaris 10, you install SUNWlucfg, SUNWluu and SUNWlur from the Solaris 10 media (or run the liveupgrade20 install script in the Tools/Installers directory).
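Roughly, with the Solaris 10 media mounted (the mount point and paths below are typical but may vary slightly between releases):

# pkgadd -d /cdrom/cdrom0/Solaris_10/Product SUNWlucfg SUNWlur SUNWluu

or, using the bundled installer script:

# cd /cdrom/cdrom0/Solaris_10/Tools/Installers
# ./liveupgrade20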

Once the new boot environment is ready, another command (luactivate) is run to mark it for activation on the next reboot. On SPARC systems this process modifies the boot-device in the OBP, while on x86 systems it updates GRUB with a new default entry.

Then all that is left is the actual reboot. During the reboot some special files and directories are synchronized one last time - because between the time the system was copied to the clone and the time of the reboot, various things can change: people still log in and change their passwords, receive and send mail, spool jobs to the printers, and so on. The administrator could even create new login accounts! To deal with this, Live Upgrade synchronizes a pre-determined list of files and directories during the first boot of the new boot environment.

The list of files and directories that are synchronized can be customized by editing the /etc/lu/synclist file.
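From memory, an entry in synclist pairs a file or directory with an action keyword such as OVERWRITE, APPEND or PREPEND - roughly like this (do check the Live Upgrade documentation for the exact format before editing the file):

/var/mail                   OVERWRITE
/var/spool/cron/crontabs    OVERWRITE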

Live Upgrade has the intelligence built in to let the new boot environment find the files in the old boot environment during the boot-up process, so this is completely automatic.

Recent Solaris Express installations prepare for the use of Live Upgrade by automatically setting up a slice and mounting it as "/second_root", but you need to unmount it and remove it from /etc/vfstab before Live Upgrade will allow you to use it. If you don't have a free slice, make one (back up /export, unmount it, and create two smaller slices in its place: one for Live Upgrade and one to restore /export to). This will be cheaper than performing upgrades the traditional way.
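With a spare slice in hand, you point lucreate at it with the -m option; a rough sketch (the BE names and device name are made up, the syntax is -m mountpoint:device:fstype):

# lucreate -c currentBE -n newBE -m /:/dev/dsk/c0t0d0s4:ufs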