
Cultured Perl: Automating UNIX System Administration with Perl

A Centralized Configuration File Strategy

UNIX system administration, always a thorny problem, is easier with the right tools. In this installment, Teodor presents ideas on the use of Perl to streamline and foolproof system administration. The system configuration engine, cfengine, is an extremely important tool in this context.

To follow the exercises in this article, you must have Perl 5.6.0 installed on your system. Preferably, your system should be a recent (2000 or later) mainstream UNIX installation (Linux, Solaris, BSD). The examples may work with earlier versions of Perl and UNIX, and with other operating systems, but you should consider any failure to function as an exercise to solve.

A big reason that UNIX administration is challenging is that every UNIX vendor believes standards are for weak-minded fools. Thus, even operating systems from the same vendor (SunOS 4.x and Solaris 2.x, for example) can be fundamentally different. In some instances, a vendor doesn’t even exist. Linux, for example, has no single vendor (although Red Hat is currently the biggest Linux distribution), and every subtype of Linux has its own quirks. POSIX standardization is a step in the right direction to solving this problem, when it’s done right. Unfortunately, it only guarantees a small subset of the functionality needed for system administration.

As I’ve often said: know your tools. System administration can be a nightmare if you try to do everything with one single tool, language, or approach. Be flexible.

If there is one system administration truism, it is this: no simple sysadmin task is fun more than twice. If you find yourself doing a simple dull task more than twice, automate it. Of course, sometimes it’s hard to automate things, but you should at least consider the option and weigh its advantages against the time you will spend automating it.

The Tool Cfengine

If you are serious about automating system administration, cfengine is a tool you should know. Ignoring cfengine is a viable option only if you like to spend your days in the vi editor.

cfengine is a system configuration engine. It takes configuration scripts as input, and then takes actions based on these scripts. It is currently at version 1.6.3 (a very stable release), and version 2.0 is on the horizon. For more information on cfengine development, visit the cfengine Web site (see Resources later in this article).

You don’t have to use everything cfengine offers, and you will probably not need the whole thing all at once. Your cfengine configuration files should start out simple, and grow as you discover more things that you want automated.

From the cfengine command reference, here are its most notable features:

• File permissions and ACLs can be monitored and fixed. For example, /etc/shadow can be kept at mode 0400, owner root, group sys; if those permissions change, you can either warn the system administrator or fix them immediately.

• NFS filesystems can be automatically mounted or unmounted, with the corresponding fstab changes.

• Netmasks, DNS configuration, default routes, and primary network interfaces can be administered through a single file.

• Files and directories can be recursively copied to another location, either locally or from a remote server.

• Files can be edited (this is a very powerful feature, offering regular expressions and global search/replace), rotated (log files, for instance), or deleted.

• Files (one at a time, everything in a directory, or everything matching a regular expression) and whole directories can be linked.

• Processes can be started, killed, restarted, or sent arbitrary signals based on regular expression matches in the process table.

• Arbitrary commands can be run.

• All of the above can be conditional upon the operating system type and revision, time of day, arbitrary user-defined classes, presence or absence of files, directories, or data in files, and so on.
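As a taste of what these features look like in practice, here is a minimal configuration sketch in the cfengine 1.x style. The syntax details may vary between releases and the paths are illustrative, so check the cfengine reference manual before adapting it:

```
control:
   actionsequence = ( files processes )

files:
   # keep /etc/shadow at mode 0400, owner root, group sys,
   # and fix it silently if it drifts
   /etc/shadow mode=0400 owner=root group=sys action=fixall

processes:
   # make sure inetd is running; restart it if it is not
   "inetd" restart "/usr/sbin/inetd"
```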

Even though you can do with Perl all the things that cfengine does, why would you want to reinvent the wheel? Editing files, for instance, can be a simple one-liner if you want to replace one word with another. When you start allowing for system subtypes, logical system divisions, and all the other miscellaneous factors, your one-liner could end up being 300 lines. Why not do it in cfengine, and produce 100 lines of readable configuration code?

From my own experience, introducing cfengine to a site is quite easy, because you can start out with a minimal configuration file and gradually move things into cfengine over time. No one likes sudden change, least of all system administrators (because they will get blamed if anything goes wrong, of course).

Configuration File Management

Managing configuration files is tough. You can start by considering whether cfengine is adequate for the task. Unfortunately, cfengine’s editing is line oriented, so complex configuration files will probably not be a good match for it. But simple files such as the TCP wrappers configuration file /etc/hosts.allow are best done through cfengine.

Usually, you will want to keep more than one version of configuration files. For instance, you may need two sets of DNS configurations in /etc/resolv.conf, one for external, and another for internal machines. The external DNS resolv.conf file could, naturally, go into a directory called “external”, while the internal resolv.conf could go into the corresponding “internal” directory. Let’s assume both directories are under a global “spec” directory, which is a sort of root for configuration files.

The following code will traverse the spec directory, searching for a filename suitable for a given machine. It will start at /usr/local/spec and go down, looking for files that match the one requested. Furthermore, it will check whether or not each directory’s name is the same as the class belonging to some machine. Thus, if we request locate_global('resolv.conf', 'wonka'), the function will look under /usr/local/spec for files named resolv.conf that are in either the root directory, or in children of the root directory whose names match the classes that the “wonka” machine belongs to. So, if “wonka” belongs to the “chocolate” class, and if there is a /usr/local/spec/chocolate/resolv.conf file, then locate_global() will return “/usr/local/spec/chocolate/resolv.conf”.

If locate_global() finds multiple matching versions of a file (for instance, /usr/local/spec/chocolate/resolv.conf and /usr/local/spec/resolv.conf), it will give up. The assumption is that we are better off with no configuration than with one of the two wrong ones. Also, note that machines can belong to more than one class.

You can build on this structure. For instance,

• /usr/local/spec/external/chocolate/resolv.conf

• /usr/local/spec/internal/chocolate/resolv.conf

• /usr/local/spec/external/sugar/resolv.conf

• /usr/local/spec/internal/sugar

will contain files for external and internal “chocolate” and “sugar” machines. You just have to set up your machine_belongs_to_class() function correctly.
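As a sketch of what machine_belongs_to_class() might look like, here is a version backed by a hard-coded hash. The class and machine names are illustrative, and in practice you would probably load the membership table from a file or generate it from your host database:

```perl
#!/usr/bin/perl -w

use strict;

# hypothetical class membership table: class name => set of member machines
my %class_members = (
    chocolate => { wonka => 1, wilbur => 1 },
    sugar     => { candyman => 1 },
    external  => { wonka => 1 },
    internal  => { wilbur => 1, candyman => 1 },
);

# true if the given machine belongs to the given class
sub machine_belongs_to_class
{
    my ($machine, $class) = @_;
    return exists $class_members{$class} &&
           exists $class_members{$class}{$machine};
}

print "wonka is in class chocolate\n"
    if machine_belongs_to_class('wonka', 'chocolate');
```

The same two-argument signature matches the call inside Listing 1’s find_sub, so this sketch can be dropped in next to locate_global() for testing.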

Once locate_global() returns a file name, it’s pretty simple to copy it to the remote system with scp or rsync. Remember, always preserve the permissions and attributes of the file: scp needs the -p flag, and rsync needs the -a flag. Consult the documentation for the file copy command you want to use. And there you have a unified configuration file tree.

Listing 1: Spec directory traversal

# {{{ locate_global: use spec directory to find a file matching the current class

use File::Find;

sub locate_global($$)
{
    my $spec_dir = '/usr/local/spec';
    my $file = shift || return undef;    # file name sought
    my $machine = shift || return undef; # machine name
    my @matches;
    my $find_sub =
        sub
        {
            print "found file $_\n";
            push @matches, $File::Find::name if ($_ eq $file);
            # the machine_belongs_to_class sub returns true if a machine
            # belongs to a class; we stop traversing down otherwise
            $File::Find::prune = 1 unless
                machine_belongs_to_class($machine, $_) || $_ eq '.';
        };
    find($find_sub, $spec_dir);
    if (scalar @matches > 1)
    {
        print "More than one match for file $file, ",
              "machine $machine found: @matches\n";
        return undef;
    }
    elsif (scalar @matches == 1)
    {
        return $matches[0]; # this is the right match
    }
    else
    {
        return undef; # no files found
    }
}
# }}}

One challenge, once you set up this sort of /usr/local/spec structure, is: how do we know that resolv.conf should go into /etc? You either have to do without the nice hierarchical structure shown here, adapt it (replace “/” with “+”, for instance — a risky and somewhat ugly approach), or maintain a separate mapping between symbolic names and real names. For instance, “root-profile” can be the symbolic name for “~root/.profile”. The last approach is the one I prefer, because it flattens out filenames and eliminates the problem of having hidden filenames. Everything is visible and tidy, under one directory structure. Of course, it’s a little more work every time you add a file to the list. The program has to know that “resolv.conf” should be copied to “/etc/resolv.conf” on the remote system, and “dfstab” should go to “/etc/dfs/dfstab” (the Solaris file for sharing NFS filesystems).
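Such a mapping can be a simple Perl hash; the entries below are the examples from the text, and the lookup function is a hypothetical helper your copying code would call:

```perl
#!/usr/bin/perl -w

use strict;

# map symbolic spec names to their real destinations on the target system;
# grow this table as you add files to the spec tree
my %destination = (
    'resolv.conf'  => '/etc/resolv.conf',
    'dfstab'       => '/etc/dfs/dfstab',  # Solaris NFS sharing table
    'root-profile' => '~root/.profile',   # flattened name for a hidden file;
                                          # the tilde must be expanded by the
                                          # code that does the actual copy
);

# return the real path for a symbolic name, or undef if it is unknown
sub destination_for
{
    my $symbolic = shift;
    return $destination{$symbolic};
}

print destination_for('dfstab'), "\n"; # prints /etc/dfs/dfstab
```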

Now let’s talk about what you can do once you have this spec directory hierarchy set up. You could, if you wanted to, look for all the users named Joe:

Listing 2: Find all password files and grep them for Joe

grep Joe `find /usr/local/spec -name passwd`

Or you can use a tool such as rep.pl, written by David Pitts, to replace every word with another:

Listing 3: Find all hosts files and change “wonka” to “willy”

find /usr/local/spec -name hosts -exec rep.pl wonka willy {} \;

Now, you can write both Listing 2 and Listing 3 in Perl if you want; the find2perl utility was written for just that purpose. It’s much simpler, however, to just use find from the start. It really is a wonderful utility that every system administrator should use. More importantly, it took me five minutes to write the two listings. How long would it take you to figure out how to use find2perl, store the code it produces in a file, then run that file? Try it and see for yourself!

Task Automation

Task automation is an extremely broad topic. I will limit this section to only simple automation of non-interactive UNIX commands. For automation of interactive commands, Expect is the best tool currently available. You should either learn its syntax, or use the Perl Expect.pm module. You can get Expect.pm from CPAN; see Resources for more details.

With cfengine, you can automate almost any task based on arbitrary criteria. Its functionality, however, is a lot like the Makefile functionality in that complex operations on variables are hard to do. When you find that you need to run commands with parameters obtained from a hash, or through a separate function, it’s usually best to switch to a shell script or to Perl. Perl is probably the better choice because of its functionality. You shouldn’t discard shell scripts as an alternative, though. Sometimes Perl is overkill and you just need to run a simple series of commands.

Automating user addition is a common problem. You can write your own adduser.pl script, or you can use the adduser program provided with most modern UNIX systems. Make sure the syntax is consistent between all the UNIX systems you will use, but don’t try to write a universal adduser program interface. It’s too hard, and sooner or later someone will ask for a Win32 or MacOS version when you thought you had all the UNIX variants covered. This is one of the many problems that you just shouldn’t solve entirely in Perl, unless you are very ambitious. Just have your script ask for user name, password, home directory, etc. and invoke adduser with a system() call.

Listing 4: Invoking adduser with a simple script

#!/usr/bin/perl -w


use strict;

my %values; # will hold the values to fill in

# these are the known adduser switches
my %switches = ( home_dir => '-d', comment => '-c', group => '-G',
password => '-p', shell => '-s', uid => '-u');

# this location may vary on your system
my $command = '/usr/sbin/adduser ';

# for every switch, ask the user for a value
foreach my $setting (sort keys %switches, 'username')
{
print "Enter the $setting or press Enter to skip: ";
$values{$setting} = <STDIN>; # read one line from standard input
chomp $values{$setting};
# if the user did not enter data, kill this setting
delete $values{$setting} unless length $values{$setting};
}

die "Username must be provided" unless exists $values{username};

# for every filled-in value, add it with the right switch to the command
foreach my $setting (sort keys %switches)
{
next unless exists $values{$setting};
$command .= "$switches{$setting} $values{$setting} ";
}

# append the username itself
$command .= $values{username};

# important - let the user know what's going to happen
print "About to execute [$command]";

# return the exit status of the command
exit system($command);

Another task commonly done with Perl is monitoring and restarting processes. Usually, this is done with the Proc::ProcessTable CPAN module, which can go through the entire process table, and give the user a list of processes with many important attributes. Here, however, I must recommend cfengine. It offers much better process monitoring and restarting options than a quick Perl tool does, and if you get serious about writing such a tool, you are just reinventing the wheel (and cfengine is stealing your hubcaps). If you do not want to use cfengine for your own reasons, consider the pgrep and pkill utilities that come with most modern UNIX systems. pkill -HUP inetd will do in one concise command as much as a Perl script four or more lines long. This said, you should definitely use Perl if the process monitoring you are doing is very complex or time sensitive.

For the sake of completeness, here is a Proc::ProcessTable example that shows how to use the kill() Perl function. The "9" as a parameter is the strongest kill() argument (signal 9 is SIGKILL, which cannot be caught or ignored), meaning roughly "kill the process with extreme prejudice, then feed it to the piranhas." Do not run this as root, unless you really want to kill your inetd processes.

Listing 5: Running through the processes, and killing all inetds

use Proc::ProcessTable;

$t = new Proc::ProcessTable;

foreach $p (@{$t->table})
{
    # note that we will also kill "xinetd" and all processes
    # whose command line contains "inetd"
    kill 9, $p->pid if $p->cmndline =~ /inetd/;
}

Summary

The most frustrating part of UNIX system administration is the variety of ways that UNIX vendors find to evade standardization. Because of this, Perl is powerless when it stands alone against all the issues in UNIX systems. Problems like the password file syntax, sharing file systems, and tracking logs quickly become unmanageable without a tool like cfengine. Nevertheless, some hope exists; after all, we just looked at some ways in which Perl can simplify system administration.

Perl interfaces quite well with cfengine. You could use Perl to produce custom-tailored cfengine configurations, or you could run Perl scripts from cfengine. I have done both, and find the integration to be painless. Alone, however, cfengine suffers from a simplistic configuration language and lack of data structures. I will expand upon this topic in a future article on cfengine.

The centralized configuration file strategy presented in this article should prove very useful if you choose to implement it. I have been using it on my site for six months now with great success. If you check the entire hierarchy into a version control system like CVS, you will also enjoy the benefit of versioned system files that can be reverted to any state that was checked into the version control system.

Resources

• Read Ted’s other Perl articles in the “Cultured Perl” series on developerWorks.

• Visit CPAN for all the Perl modules you ever wanted.

• Perl.com has Perl information and related resources.

• Get cfengine, the all-in-one UNIX administration tool.

• UNIX System Administration Handbook, 3rd Edition, by Evi Nemeth, Garth Snyder, Scott Seebass, and Trent R. Hein (Prentice Hall, 2000), describes many different aspects of system administration, from basic topics to UNIX esoterica, and includes explicit coverage of four popular systems: Solaris 2.7, Red Hat Linux 6.2, HP-UX 11.00, and FreeBSD 3.4.

• Programming Perl, Third Edition, by Larry Wall, Tom Christiansen, and Jon Orwant (O’Reilly & Associates, 2000) is the best guide to Perl today, up-to-date with 5.005 and 5.6.0.

• Perl for System Administration: Managing Multi-Platform Environments with Perl, by David N. Blank-Edelman (O’Reilly & Associates, 2000) is a great summary of different tools and techniques written in Perl for system administration. Emphasis is placed on portability.

• UNIX Power Tools, 2nd Edition, by Jerry Peek, Tim O’Reilly, and Mike Loukides (O’Reilly & Associates, 1997) is a great guide to getting started with UNIX shells and related tools. It’s a little dated, but still excellent.

• Visit O’Reilly & Associates: publishers of Programming Perl and many other fine books.

• For related articles by this author, see “Cultured Perl: One-liners 101” and “Cultured Perl: Perl 5.6 for C and Java programmers” on developerWorks.

• Browse more Linux resources on developerWorks.


About the Author:

Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work, Perl, text parsing, three-tier client-server database architectures, and UNIX system administration. Contact Ted with suggestions and corrections at tzz@bu.edu.
