Friday, December 4, 2009

Splitting Directives - Modular "Molecule" Configurations of Apache httpd

At my 9 to 5, I am currently wearing a sysadmin hat (something between a welder's mask and Sherlock Holmes' tweed cap, I think). I love process automation (the foundation of programming, I'd say), so everything is done with a bash script for maximum reuse.

My vision is to be able to rebuild a new version of my httpd configurations and deploy them to the target servers in one command (the testing process happens BEFORE the final build and deploy, silly!)

Why? One of the problems that I've seen in past projects is a lack of configuration management and revision control for infrastructure applications like httpd or WebSphere. Worse, different tiers in the enterprise (boldly going, anyone?) and different servers can end up with different configurations because changes are not rolled out methodically.



Awesome sauce local geek-about-town Jason Leveille turned me on to a tool called Puppet by Reductive Labs. I am impressed with the feature set that I have seen, but alas, the method of delivery (client daemons running on all the boxes) can't be used for my purposes.

If you have complete control of your deploy environment and want to solve lots of other problems besides automating and modularizing updates/builds of Apache httpd, I think that you should definitely look at Puppet. I will likely be diving deeper even without having a clear need right now that it can fill.

Enough about Puppet, though. This post is called Splitting Directives and there hasn't been one mention of httpd.conf yet! What gives? Well if you've ever met me I'm either listening and agreeing with what you say or I'm telling it the long way. Life is a journey, right? Forgive me my exposition and proceed if you want to know the method I cooked up for splitting common and specific Apache httpd.conf directives and for building and deploying them.

First, you have to reorganize your httpd.conf file and bring related directives close together. For example, move entries that refer to virtual host docroots into the VirtualHost blocks. Put the Listen and NameVirtualHost directives right before the VirtualHost block that they apply to. Move IfModule blocks so they occur right after the module is loaded (be careful if the stuff in the IfModule has multiple dependencies). That kind of thing.

Next, mark up with comments how you can split the httpd.conf into multiple files. Put a start and an end marker around each section you think is a "modular molecule".

Split the httpd.conf using the start and end markers into separate files, put the new files in a subdirectory called "common" or similar underneath /apacheroot/conf. Replace (or comment out and replace) the original directives in the httpd.conf file with Include directives that match your "modular molecule" filenames.
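
If you'd rather not split by hand, here is a minimal sketch of how the split could be scripted. It assumes you marked each section with comment lines of the form `# BEGIN molecule <name>` and `# END molecule <name>`. That marker convention and the function name `split_molecules` are made up for this example; httpd knows nothing about them. Lines outside any marker pair are simply not copied.

```shell
#!/bin/bash
# Split a marked-up httpd.conf into one file per "molecule".
# Assumed (hypothetical) marker lines:
#   # BEGIN molecule main
#   ...directives...
#   # END molecule main
# Usage: split_molecules <marked-up httpd.conf> <output directory>
split_molecules() {
    local src="$1" outdir="$2"
    mkdir -p "$outdir" || return 1
    awk -v outdir="$outdir" '
        # start marker: open (and truncate) <outdir>/<name>.conf
        /^# BEGIN molecule / { out = outdir "/" $4 ".conf"; printf "" > out; next }
        # end marker: close the current file
        /^# END molecule /   { if (out != "") close(out); out = ""; next }
        # inside a molecule: copy the line through
        out != ""            { print > out }
    ' "$src"
}
```

Something like `split_molecules conf/httpd.conf conf/common` would then leave you with conf/common/main.conf, conf/common/modules.conf and so on, one per marker pair.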

For example, here is a very basic version of the httpd.conf I am currently working with:


# ServerRoot, ServerName, DocumentRoot, ServerAdmin directives
Include conf/common/main.conf
# All the LoadModule directives
Include conf/common/modules.conf
# LoadModule and WebspherePluginConfig directives 
Include conf/common/websphere.conf
# User and Group directives, with IfModule blocks around them
Include conf/common/runas.conf
# VirtualHost blocks and associated Listen and NameVirtualHost records, one per file.
# Careful with repeats: a repeated NameVirtualHost with the same value typically only logs a warning,
# but a second Listen for the same IP and port makes httpd fail with "Address already in use".
# So make it one VirtualHost block per file, and when vhosts share an IP and port keep their Listen in just one of the files.
# For extra "security" you can explicitly include these, or include one file at conf/common/vhosts.conf that specifies all the others.
Include conf/common/vhosts/*.conf

OK, so at this point we just have a bunch of files and a bunch of includes and there's not much to it. Certainly it can't be used across the enterprise in this form, no matter how fancy the deployment script is. The configuration is just not suitable for more than one computer (unless they all use the same hostnames and all IPs are bound the same, etc. etc.). I am assuming that, as in my case, your configurations are all substantially the same and that only a few things need to be modified per host. Things like IPs and hostnames.

That is what the next step addresses. We create another folder underneath /apacheroot/conf called "config" or similar. We now go into each of the files we created earlier and reorganize them according to whether each directive is a specific one or a common one. The difference between the previous method of organization and this one? This time, wrap the specific directives around the common ones. Where common and specific directives occur in the same file, create a hierarchy with the common directives at the core.

We're running a decorator pattern on the configurations ... the common directives are wrapped and decorated by the specific directives. Specific directives go into the conf/config folder and common directives stay in the common folder, albeit in an "unwrapped" (and probably unusable) state.

Edit your httpd.conf to reflect where you have included "modules" that are only specific directives, or wrap specific directives. Omitting the comments for brevity, you now have something like this:


Include conf/config/main.conf
Include conf/common/modules.conf
Include conf/config/websphere.conf
Include conf/config/runas.conf
Include conf/config/vhosts/*.conf

A sample vhost.conf might look like this:


Listen 127.0.0.1:80
NameVirtualHost 127.0.0.1:80
<VirtualHost 127.0.0.1:80>
   ServerName localhost
   ServerAlias local.dev mysite.dev mysite.local foo.bar
   Include conf/common/vhosts/sample.vhost.conf   
</VirtualHost>
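
Since a wrapper like this differs from host to host in only a handful of values, you could also generate it instead of typing it. A rough sketch, assuming a helper I'm calling `write_vhost_wrapper` (hypothetical, not part of httpd) that prints the wrapper to stdout for a given IP, port, server name and common file:

```shell
#!/bin/bash
# Print a host-specific wrapper around a common vhost file.
# write_vhost_wrapper is a hypothetical helper, not part of httpd.
# Usage: write_vhost_wrapper <ip> <port> <server name> <common vhost file>
write_vhost_wrapper() {
    local ip="$1" port="$2" server_name="$3" common_file="$4"
    # Specific directives on the outside, the common file included at the core
    cat <<EOF
Listen ${ip}:${port}
NameVirtualHost ${ip}:${port}
<VirtualHost ${ip}:${port}>
   ServerName ${server_name}
   Include ${common_file}
</VirtualHost>
EOF
}
```

For example, `write_vhost_wrapper 127.0.0.1 80 localhost conf/common/vhosts/sample.vhost.conf > conf/config/vhosts/sample.conf` writes a wrapper along the lines of the sample (minus the ServerAlias line, which you would add by hand or as another parameter).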

Run apachectl -t from bin to check your configuration. Working? Awesome. Tweak as necessary if not.

This is a configuration good for one single host, but all of the specific directives have been isolated on the filesystem to the "conf/config" directory. We can now use whatever means we want to organize multiple configurations on the filesystem -- based on the same common directives with similar specific directives.

At the simplest, we just copy the config directory and make a different configuration there. We started with a "dev" configuration we use on our localhost, but what about when we want to deploy this out onto the QA server(s)?


cp -r conf/config conf/qa1
cp -r conf/config conf/qa2

We need to store these in version control for later use. And that is going to factor in to the deployment method. This project is using svn so I'd run these commands, too:


svn add conf/qa1
svn add conf/qa2
svn commit conf -m "Adding configurations for qa1 and qa2"

Now you have to go in and edit them to have the right IPs and hostnames and what-not and then commit changes after.
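
It is easy to miss a value when editing a copied configuration by hand. As a sanity check before committing, something like the sketch below could list any lines that still carry the dev values. The helper name `find_dev_leftovers` is hypothetical, and the patterns are just the 127.0.0.1 and localhost values from the sample vhost; swap in whatever your dev configuration uses.

```shell
#!/bin/bash
# List lines in a copied configuration that still carry the dev values.
# find_dev_leftovers is a hypothetical helper; the patterns are assumptions
# taken from the sample dev vhost (127.0.0.1 and localhost).
# Exit status is 0 if leftovers were found, 1 if the directory is clean.
find_dev_leftovers() {
    local dir="$1"
    # skip Subversion metadata so pristine copies are not reported as well
    grep -rnE --exclude-dir=.svn '127\.0\.0\.1|localhost' "$dir"
}
```

Running `find_dev_leftovers conf/qa1` prints each offending file, line number and line, so an empty result means nothing obvious was left behind.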

On your build server, you should have all these configurations checked out somewhere local. Let's say in /build/apache/httpd/modular/, with the source code repository at svn://localhost/apache. You should also have a working area outside of SVN where you can build a configuration locally before you deploy it across the wire to a target. Something like /build/apache/httpd/configs.

The build script is pretty simple. It is designed to be run from /build/apache/httpd, and it expects directories "modular" and "configs" there for the configurations and the build output:


#!/bin/bash
# $1 config to build (the script stops with a usage message if it is missing)
SVNREPO="svn://localhost/apache"
CONFIG="${1:?usage: $0 <config>}"

# make directory for configuration
mkdir -p "configs/$CONFIG"

# export the common directive files to the build dir for this config
svn export "$SVNREPO/common" "configs/$CONFIG/common"
# export the configuration specific files to the build dir for this config as "conf/config"
svn export "$SVNREPO/$CONFIG" "configs/$CONFIG/config"

Run this script with the parameter "qa1" and there is now a functional configuration for your qa1 server at /build/apache/httpd/configs/qa1. There are a lot of ways to deploy this configuration to a target. I am currently using scp directly but you could just as easily compress the config with tar or another program and move it across as one file.
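
The tar alternative mentioned above could be sketched roughly like this. The function names `pack_config` and `unpack_config` are made up for the example, and the sketch assumes it is run from the directory that holds "configs" and that the archive is written somewhere outside the build directory being packed.

```shell
#!/bin/bash
# Pack a built configuration into a single compressed archive.
# pack_config / unpack_config are hypothetical helper names.
# Usage: pack_config <config> <archive file>
pack_config() {
    local config="$1" archive="$2"
    # -C makes the paths inside the archive relative to the build dir,
    # so the archive contains ./common and ./config directly
    tar -czf "$archive" -C "configs/$config" .
}

# Unpack an archive made by pack_config into a conf directory.
# Usage: unpack_config <archive file> <conf directory>
unpack_config() {
    local archive="$1" conf_dir="$2"
    mkdir -p "$conf_dir" && tar -xzf "$archive" -C "$conf_dir"
}
```

You would run `pack_config qa1 /tmp/qa1.tar.gz` on the build server, move the one file across, and run the equivalent of `unpack_config` against the conf directory on the target.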

Example, using scp directly:

#!/bin/bash
# $1 config $2 user $3 host $4 apache root path on host
CONFIG="$1"
REMOTE_USER="$2"
HOST="$3"
# not named PATH: assigning to PATH would stop the shell from finding scp and ssh
APACHE_ROOT="$4"

scp -r "configs/$CONFIG"/* "$REMOTE_USER@$HOST:$APACHE_ROOT/conf"

# test and then restart apache here too?
# ssh "$REMOTE_USER@$HOST" "$APACHE_ROOT/bin/apachectl -t"
## should check the response above before proceeding ... you can roll that yourself :)
# ssh "$REMOTE_USER@$HOST" "$APACHE_ROOT/bin/apachectl graceful"
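
If you don't feel like rolling that check yourself, here is one way it might look. This is only a sketch: `check_and_restart` is a hypothetical helper, and it relies on apachectl passing back a non-zero exit status when `-t` finds a syntax error, so the restart is attempted only after the test succeeds.

```shell
#!/bin/bash
# Test the deployed configuration on the target and restart only if it parses.
# check_and_restart is a hypothetical helper name.
# Usage: check_and_restart <remote user> <host> <apache root path on host>
check_and_restart() {
    local remote_user="$1" host="$2" apache_root="$3"
    # ssh exits with the remote command's status, so a failed syntax
    # check on the target shows up as a failure here
    if ssh "$remote_user@$host" "$apache_root/bin/apachectl -t"; then
        ssh "$remote_user@$host" "$apache_root/bin/apachectl graceful"
    else
        echo "Config test failed on $host; not restarting" >&2
        return 1
    fi
}
```

A graceful restart lets in-flight requests finish, which is why I'd reach for it over a hard restart on a box that is taking traffic.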

Many configuration directives are tied to a particular site or a particular layer of the enterprise (dev, qa, stage, production, etc.) and not just to the specific server. The method given above can be extended to build more complex hierarchies of configuration. Simply repeat the steps, but look for commonalities between "stage1" and "stage2" that have analogs in the "prod" layer as well. Separate them from the "config" level as you did with the "common" directives. Make a template set called "layer" or "mode" or whatever you call the steps in your deployment cycle. Copy that template set on the filesystem as you did with the "config" folder.

Consider the best way to nest your files for your use when modifying configurations. All of the intermediary levels between the most specific and the most general directives need to stay as closely in sync as they can for this system to be useful in avoiding misconfigurations and configuration entropy.

I think a good general rule would be that each server has a "site" role (when there are multiple httpd installs for instance), a "layer" role, and a "server" role. That would correspond to a structure like this:


conf/common
conf/site (template for conf/sites)
conf/sites/site1
conf/sites/site2
conf/layer (template for conf/layers)
conf/layers/layer1
conf/layers/layer2
conf/server (template for conf/servers)
conf/servers/server1
conf/servers/server2

The build script then must be adapted to take each of these into account when reducing the set of configurations down into the configuration needed for this particular host.


#!/bin/bash
# $1 server to build
# $2 layer to build
# $3 site to build
SVNREPO="svn://localhost/apache"
SERVER="$1"
LAYER="$2"
SITE="$3"

BUILDDIR="$SITE/$LAYER/$SERVER"

# make directory for configuration
mkdir -p "configs/$BUILDDIR"

# export the common directive files to the build dir for this config
svn export "$SVNREPO/common" "configs/$BUILDDIR/common"
# export the server, layer and site specific files to the build dir
# for this config as "conf/server", "conf/layer" and "conf/site"
svn export "$SVNREPO/servers/$SERVER" "configs/$BUILDDIR/server"
svn export "$SVNREPO/layers/$LAYER" "configs/$BUILDDIR/layer"
svn export "$SVNREPO/sites/$SITE" "configs/$BUILDDIR/site"
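
If you want to try the layout out before wiring it to the repository, the same assembly can be sketched against a local checkout. Everything here is an assumption for illustration: the helper name `assemble_config`, the idea that the checkout mirrors the common/, servers/, layers/ and sites/ structure, and the choice to refuse to overwrite an existing build directory. Unlike svn export, a plain copy also drags along any .svn metadata in the checkout.

```shell
#!/bin/bash
# Assemble one host's configuration from a local checkout instead of
# exporting from the repository. assemble_config is a hypothetical helper.
# Usage: assemble_config <checkout dir> <server> <layer> <site>
assemble_config() {
    local modular="$1" server="$2" layer="$3" site="$4"
    local build_dir="configs/$site/$layer/$server"
    local part
    # refuse to build from a role that does not exist in the checkout
    for part in common "servers/$server" "layers/$layer" "sites/$site"; do
        [ -d "$modular/$part" ] || { echo "Missing $modular/$part" >&2; return 1; }
    done
    # refuse to copy on top of an earlier build
    [ ! -e "$build_dir" ] || { echo "$build_dir already exists" >&2; return 1; }
    mkdir -p "$build_dir" || return 1
    cp -R "$modular/common"          "$build_dir/common" &&
    cp -R "$modular/servers/$server" "$build_dir/server" &&
    cp -R "$modular/layers/$layer"   "$build_dir/layer" &&
    cp -R "$modular/sites/$site"     "$build_dir/site"
}
```

Called as `assemble_config modular server1 layer1 site1`, it would leave the reduced configuration under configs/site1/layer1/server1.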

Each site may have different configurations for each layer. If the Include statements in your configuration files nest through many levels of hierarchy, you may wish to use a hierarchical folder structure. That is the fundamental model that I used, but it is more complex to illustrate conceptually, so I will leave illustrating it to a later post (if there is demand).

I hope this post helps someone else who is looking for information on build scripts, modular configurations, modular directives, magic modular molecules or general httpd common configuration tomfoolery. Good luck and take care! Considered actions are the only defensible ones.
