Wednesday, September 17, 2014

Making better use of libvirt hooks

Libvirtd includes handy hooks for doing management work at various phases in the lifecycle of the libvirt daemon, attached networks, and virtual machines. I've been using these hooks for various things and have found them particularly useful for management of short-lived Linux containers. Some of my use cases for these hooks include:
  • changing network policy
  • instantiating named routing tables
  • creating ramdisks for use by containers 
  • pre-loading data before container startup
  • archiving interesting data at container shutdown 
  • purging data at container destruction

Here's how the hooks work on a system with RedHat lineage:

The hook scripts live in /etc/libvirt/hooks. The scripts are named according to their purpose. I'm focusing right now on the LXC hook which is named /etc/libvirt/hooks/lxc. Note that neither the directory, nor the scripts exist by default.

The lxc script is called several times in each container's lifecycle, and is passed arguments that specify the libvirt domain id and the lifecycle phase. During startup and shutdown of one of my LXC systems, the script gets called five times, like this:

 /etc/libvirt/hooks/lxc MyAwesomeContainer prepare begin -  
 /etc/libvirt/hooks/lxc MyAwesomeContainer start begin -  
 /etc/libvirt/hooks/lxc MyAwesomeContainer started begin -  
 /etc/libvirt/hooks/lxc MyAwesomeContainer stopped end -  
 /etc/libvirt/hooks/lxc MyAwesomeContainer release end -  

In addition to having those command line arguments, each time the script is run, it receives the guest's entire libvirt definition (XML) on STDIN.

My script was parsing the command line arguments to figure out the domain ID and lifecycle phase, then calling countless modules to do different tasks depending on the context. It also read in some external configuration files which specified various parameters for each domain.

The script quickly became an absolute monster. It was doing too much, had too many dependencies, too many modules baked in, and was hard to test without interrupting all of the guest systems.

Two changes to the approach brought this facet of guest management under control:

Change #1: Run-parts Style
The first thing I did was to rip all of the smarts out of the script. All it does now is call other scripts in a manner similar to run-parts. It looks something like this:

 # Collect input from STDIN, stick it in $DATA  
 # Rip the domain ID out of $*  
 DOMAIN=$1; shift  
 # Join the remaining command line bits with '_' chars, then add '.d'  
 DIR=${0}_$(/bin/echo "$*" | /bin/sed -e 's/ /_/g' -e 's/$/.d/')  
 # Run all numbered-and-executable scripts in DIR with the usual CLI arguments,  
 # passing data collected on our STDIN to the script's STDIN.  
 for script in $DIR/[0-9]* ; do  
  if [ -x $script ] ; then  
   /bin/echo -n "$DATA" | $script $DOMAIN $*  

It now weighs in at a svelte 8 lines. It uses its own name ($0 is "lxc" in this case) and the passed command line arguments (not the first one which identifies the domain ID) to divine the name of a directory full of other scripts which need to be run. Then, it runs whatever scripts it finds in that directory. So, when it's called like this:
 /etc/libvirt/hooks/lxc MyAwesomeContainer prepare begin -  

It runs all of the scripts it finds in:

Those scripts must be executable, and the must be named with a leading digit to specify run order. It's an alphabetic sort, so consider using leading zeros: 11 sorts before 5, but not before 05.

When the new wrapper calls the child scripts, it uses the same command line arguments, and feeds the same data on STDIN. There's an opportunity for things to go weird if libvirt used arguments with spaces in them, but it doesn't do that, so meh.

Now, rather than having a section of one big script that looks like this:
 DOMAIN=$1; shift  
 case "$*" in  
     "prepare begin -" )  
         mkdirs_function $DOMAIN
         setup_ramdisk_function $DOMAIN
         preload_data_function $DOMAIN
     "release end -" )  
         archive_data_function $DOMAIN
         purge_data_function $DOMAIN
         destroy_ramdisks_function $DOMAIN

There is a directory structure that looks like this:
              ├── lxc  
              ├── lxc_prepare_begin_-.d  
              │   ├──  
              │   ├──  
              │   └──  
              └── lxc_release_end_-.d  

Each of those scripts operates independently, and has all of the data it needs on the command line and STDIN. They run in order.

If I discover that I need to do something in the "started begin -" lifecycle phase, it's as easy as creating the /etc/libvirt/hooks/lxc_started_begin_-.d directory and dropping some new scripts in there.

Change #2: Libvirt Definition Supports Metadata
The second change is that I began using the libvirt guest definition to specify exactly what work needs to be done by the hook scripts. The various functions (create ramdisk, preload data, etc...) in my old script had been using configuration files, patterns based on the domain ID and so forth. This way is much better.

The libvirt XML schema is quite strict, but it does allow a <metadata> element which will cart around anything you care to define. I use it to specify ramdisk mount points and sizes, pre-load and post-cleanup stuff, etc...

For example, there are some guests which have directories that I don't want to persist after the guest container is destroyed. The libvirt definition for those guests includes the following:

This blob of XML does nothing by itself. Libvirt totally ignores it. The action happens when /etc/libvirt/hooks/lxc_release_end_-.d/ gets called.

That guy looks for <dir> elements in <metadata><post-rm-rf>, and removes the specified directories. It's named '02' to ensure that it runs after, which squirrels away logfiles and other data that I might want to retain.

Because this script receives the libvirt definition XML on STDIN, the information about what to remove travels with the guest. On shutdown of containers that don't include the <post-rm-rf> element, nothing happens.

Here's what's in
 #!/usr/bin/env python  

 def xml_from_stdin():  
   import sys  
   import xml.etree.ElementTree as ET  
   tree = ET.parse(sys.stdin)  
 def rm-rf(dir):  
   import shutil  
   shutil.rmtree(dir, ignore_errors=True)  
 def main():  
   root = xml_from_stdin()  
   for dir in root.find('metadata').findall('post-rm-rf'):  
 if __name__ == "__main__":  

Another example:
The script is a little more complicated. Here's the relevant XML:

There are two things going on inside the <archive> element. This first are the <dircopy> bits, which specify data to keep, and where to put it.

The second are the <datestr> bits, which specify strftime format strings, and the related strings in the destination path which should be replaced by an appropriately formatted timestamp.

Now, any <dst> element which includes the string %DATE% will find that string replaced with the current date, formatted like this: 2014-09-17. Similarly, any <dst> element which includes the string blahblahblah will find blahblahblah replaced by a nicely formatted timestamp. There's nothing magic about the percent symbols, the <replace> element can be any string.

The script is here.

Final Thoughts
My guest management is much more agile with libvirt carrying around extra instructions in <metadata> and the modularity afforded by run-parts style hooks functions.

Each time I want a new thing to happen to a guest, I need to do three things:
  • Invent some XML describing specifics of the operation, jam it into <metadata>
  • Write a script which parses the XML, implements the new feature
  • Drop that script onto all of my libvirt nodes in the appropriate lifecycle_args.d directory
One gotcha which has tripped me up more than once: libvirtd might not notice that the hooks script has been installed (or edited?) until libvirtd gets restarted. Maybe I'll remember this tidbit after having blogged about it. [Edit: Nope. I was just surprised to read this section from an earlier draft.]

1 comment:

  1. Very helpful post Chris, thanks!
    For anyone else following this that finds the metadata is not being passed as expected within the domains XML.
    Since 0.9.10, you must provide custom namespaces.