Thursday, June 19, 2008

Using a dedicated ZFS file system to manage process core dumps

ZFS just bristles with potential: quotas, reservations, turning compression or atime updates on or off without unmounting. The list goes on.

So now that we have ZFS root (since Nevada build snv_90, and even earlier when using OpenSolaris or other distributions), let's start to make use of these features.

First things first: on my computer I don't care about access-time updates on files and directories, so I disable them.

/ $ pfexec zfs set atime=off rpool

That is not particularly spectacular in itself, but since the feature is there, I use it. The idea is of course to save a few disk updates and the corresponding I/Os.

Next: core dumps. One of my pet hates. Many processes dump core in your home directory, where the files get overwritten or forgotten; then any number of core files end up lying around all over the file systems, all of them just wasting space since I don't really intend to try to analyze any of them.

Solaris has a great feature by which all core dumps can be directed to a single directory and, on top of that, given more meaningful file names.

So the idea is to create a directory, say /var/cores, and store the core files there for later review. But knowing myself, these files will just continue to waste space until I one day decide to actually troubleshoot a specific issue.
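Even with a quota in place, old core files still pile up. One possible safety net is a small cleanup sketch like the one below; the prune_cores helper name and the 30-day cutoff are my own assumptions, not anything coreadm or ZFS provides:

```shell
#!/bin/sh
# Hypothetical helper: delete core.* files in a directory that are
# older than a given number of days. Illustrative sketch only.
prune_cores() {
    dir=$1
    days=$2
    find "$dir" -type f -name 'core.*' -mtime +"$days" -exec rm -f {} \;
}

# e.g. run periodically from root's crontab:
#   prune_cores /var/cores 30
```

Run from cron it keeps the file system well under its quota without any manual housekeeping.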

To me this sounds like a perfect job for ZFS.

First I check that there is not already something called /var/cores:

/ $ ls /var/cores
/var/cores: No such file or directory

Great. Now I create it.

/ $ pfexec zfs create rpool/export/cores
/ $ pfexec zfs set mountpoint=/var/cores rpool/export/cores

And set a limit on how much space it can ever consume:

/ $ pfexec zfs set quota=2g rpool/export/cores

Note: This can easily be changed at any time, simply by setting a new quota.

This results in the picture below.

/ $ df -h
Filesystem size used avail capacity Mounted on
rpool/ROOT/snv_91 20G 5.9G 7.0G 46% /
/devices 0K 0K 0K 0% /devices
/dev 0K 0K 0K 0% /dev
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 2.3G 416K 2.3G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
/usr/lib/libc/
 13G 5.9G 7.0G 46% /lib/
fd 0K 0K 0K 0% /dev/fd
swap 2.3G 7.2M 2.3G 1% /tmp
swap 2.3G 64K 2.3G 1% /var/run
rpool/export 20G 19K 7.0G 1% /export
rpool/export/home 20G 6.8M 7.0G 1% /export/home
rpool 20G 44K 7.0G 1% /rpool
rpool/export/cores 2.0G 18K 2.0G 1% /var/cores
SHARED 61G 24K 31G 1% /shared
... snip ...

And checking the settings on the /var/cores ZFS file system:

/ $ zfs get all rpool/export/cores
NAME PROPERTY VALUE SOURCE
rpool/export/cores type filesystem -
rpool/export/cores creation Thu Jun 19 14:18 2008 -
rpool/export/cores used 18K -
rpool/export/cores available 2.00G -
rpool/export/cores referenced 18K -
rpool/export/cores compressratio 1.00x -
rpool/export/cores mounted yes -
rpool/export/cores quota 2G local
rpool/export/cores reservation none default
rpool/export/cores recordsize 128K default
rpool/export/cores mountpoint /var/cores local
rpool/export/cores sharenfs off default
rpool/export/cores checksum on default
rpool/export/cores compression off default
rpool/export/cores atime off inherited from rpool
rpool/export/cores devices on default
rpool/export/cores exec on default
rpool/export/cores setuid on default
rpool/export/cores readonly off default
rpool/export/cores zoned off default
rpool/export/cores snapdir hidden default
rpool/export/cores aclmode groupmask default
rpool/export/cores aclinherit restricted default
rpool/export/cores canmount on default
rpool/export/cores shareiscsi off default
rpool/export/cores xattr on default
rpool/export/cores copies 1 default
rpool/export/cores version 3 -
rpool/export/cores utf8only off -
rpool/export/cores normalization none -
rpool/export/cores casesensitivity sensitive -
rpool/export/cores vscan off default
rpool/export/cores nbmand off default
rpool/export/cores sharesmb off default
rpool/export/cores refquota none default
rpool/export/cores refreservation none default

Note that access-time updates on this file system are off - the setting has been inherited from the pool. The only "local" settings are the mountpoint and the quota, which correspond to the items I specified manually.

Now all that remains is to make new core files actually use this directory. At present, the default coreadm settings look like this:

/ $ coreadm
global core file pattern:
global core file content: default
init core file pattern: core
init core file content: default
global core dumps: disabled
per-process core dumps: enabled
global setid core dumps: disabled
per-process setid core dumps: disabled
global core dump logging: disabled

Looking at the coreadm man page, there is a fair amount of flexibility in what can be done. I want core file names to identify the zone in which the process was running, the process executable file, and the user. I also don't want core dumps to be overwritten when the same process keeps on faulting, so I will add a time stamp to the core file name.
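To get a feel for the names such a pattern produces before committing to it, here is a rough preview in plain shell. The preview_core_name function is my own sketch, not part of coreadm; per the man page, %z is the zone name, %f the executable file name, %u the effective user ID and %t a decimal time stamp:

```shell
#!/bin/sh
# Sketch: approximate the file name coreadm would build from the
# pattern core.%z.%f.%u.%t for a given executable name.
preview_core_name() {
    exe=$1                                           # %f - executable file name
    zone=$( (zonename) 2>/dev/null || echo global )  # %z - zone (falls back to "global" off Solaris)
    uid=$(id -u)                                     # %u - effective user ID
    ts=$(date +%s)                                   # %t - decimal time stamp
    echo "core.$zone.$exe.$uid.$ts"
}

preview_core_name firefox-bin
```

For a crash of firefox-bin in the global zone this prints something along the lines of core.global.firefox-bin.100.1213877900 (the exact UID and time stamp will of course differ).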

/ $ pfexec coreadm -g /var/cores/core.%z.%f.%u.%t

Note that the global core file pattern only takes effect once global core dumps are enabled. I also want to store core files for set-id processes, and to log an event whenever a core file is generated:

/ $ pfexec coreadm -e global -e global-setid -e log

And finally, just to review the core-dump settings, these now look like this:

/ $ coreadm
global core file pattern: /var/cores/core.%z.%f.%u.%t
global core file content: default
init core file pattern: core
init core file content: default
global core dumps: enabled
per-process core dumps: enabled
global setid core dumps: enabled
per-process setid core dumps: disabled
global core dump logging: enabled
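A quick way to exercise the new settings is to abort a throwaway process on purpose; on a box configured as above, a file matching the /var/cores pattern should then appear. The snippet below only checks that the disposable shell really died from SIGABRT (exit status 128 + 6 = 134); whether a dump is actually written also depends on the coreadm and ulimit settings in effect:

```shell
#!/bin/sh
# Abort a disposable shell to trigger a core dump.
status=0
sh -c 'kill -s ABRT $$' || status=$?

# A process killed by SIGABRT exits with 128 + 6 = 134.
echo "exit status: $status"

# On the configured box, now look for the dump, e.g.:
#   ls /var/cores
```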

Now if that is not useful, I don't know what is! Once you run your machine with a ZFS root, you will soon start to appreciate just how much space is wasted, and just how rigid and inflexible other file systems are.
