Industrial Age Sysadminery:
cfengine Deployment and Configuration
John SJ Anderson
genehack.org <genehack@genehack.org>
IS Mavens (NCBI/NLM/NIH) <john@ismavens.com>
http://genehack.org/talks/dclug_cfengine
21 Dec 2005
slide00
A Typical Sysadmin's Job
in a Typical Enterprise Environment
- Lots of hands-on work
- Lots of repetitive work
- Installing, upgrading, or replacing a machine is a major effort, for both admins and users
- Lots of sequestered, undocumented domain knowledge (a/k/a "job security")
- Lots of tiny machine-to-machine variations
- Automation is mostly one-off {shell,Perl} scripts that are hand-crafted per need, and that can't be maintained (or even understood!) by anybody but the author.
It's no wonder we have a reputation as cranky misanthropes!
slide01
This is the same profile as a pre-Industrial Revolution manufacturing job
(Blacksmiths were pretty cranky too.)
What happened during the Industrial Revolution?
- Automation
- Standardization
- Greater integration
- Higher individual worker productivity
slide02
Why do we need to deal with this now?
Machine counts are increasing because of the success of the
"horizontal scaling" model of data center provisioning (a/k/a,
"throw another box on the fire").
- Users seem to prefer lots of small boxes to one big one
- The 1U, 2CPU server is the current "sweet spot" for x86 rack-mount kit for the majority of uses
- Cluster-rama
- Even machine virtualization doesn't help: you still have to manage all the zones
How do admins deal with this?
- Increase staffing (but this doesn't always help; see The Mythical Man-Month for why)
- Increase automation
slide03
Requirements for successful large-scale automation
- Team buy-in
- Management buy-in
- Time to do the implementation
- Tools to automate common tasks (no, not just Perl)
slide04
cfengine is a tool for
enterprise systems administration automation
Features:
- Cross-platform:
  - WinNT/Win2K
  - UNIX (Solaris, Linux, MacOS X, *BSD, AIX, etc.)
- IPv4 and IPv6 support (reasonably future-proof)
- Actively developed, with a helpful user community
- Licensed under the GPL
Philosophy:
- A "policy language" is used to describe desired system configurations
- Designed to "converge" systems toward a desired configuration
- Description is updated and maintained in a central location, and distributed everywhere, automatically. (This is a critical point.)
slide05
All cfengine actions are pull-based
Push-based automation causes as many problems as it solves
- One (or more) of the targets is inevitably down at the time of the push
- Mobile clients are never connected at the proper time
Push-based automation leads to system divergence
Pull-based automation doesn't have these problems.
slide06
cfengine places and parts
| Directory/Binary          | Function                               |
| /var/cfengine             | Top level for all local cfengine files |
| /var/cfengine/bin         | Binaries                               |
| /var/cfengine/bin/cfagent | the "autonomous agent"                 |
| /var/cfengine/bin/cfenvd  | anomaly detection; entropy collection  |
| /var/cfengine/bin/cfexecd | scheduling; reporting                  |
| /var/cfengine/bin/cfkey   | generates public/private key pairs     |
| /var/cfengine/bin/cfrun   | triggers remote run of cfagent         |
| /var/cfengine/bin/cfservd | file serving; remote execution         |
| /var/cfengine/inputs      | configuration files                    |
| /var/cfengine/outputs     | logged output from cfagent runs        |
| /var/cfengine/ppkeys      | public/private RSA keys                |
slide07
A "Basic" cfengine Deployment
slide08
A "Full-Blown" cfengine Deployment
slide09
Bootstrapping A "Basic" System
Two hosts: mendel, the management host, and koch, a 'normal' host.
cfengine will manage configurations on both machines.
Steps:
- Compile and/or install cfengine on mendel
- Get cfservd running on mendel
- Set up update.conf on mendel
- Distribute software and update.conf to koch
- Generate key pairs on mendel and koch
- Set up cfagent.conf on mendel
- Run cfagent on mendel
- Run cfagent on koch
slide10
Compile and install cfengine
on mendel
# wget ftp://ftp.iu.hio.no/pub/cfengine/cfengine-2.1.17.tar.gz
# tar xzvf cfengine-2.1.17.tar.gz
# cd cfengine-2.1.17
# ./configure
# make
# make install
-OR-
# apt-get install cfengine
-OR-
# emerge cfengine
(et cetera)
Requires BerkeleyDB and OpenSSL, which you most likely already have.
slide11
Get cfservd running on mendel
Edit /var/cfengine/inputs/cfservd.conf:
control:
   domain               = ( home.genehack.org )
   IfElapsed            = ( 1 )
   MaxConnections       = ( 10 )
   AllowConnectionsFrom = ( 192.168.1 )
   TrustKeysFrom        = ( 192.168.1 )
   AllowUsers           = ( root )
   LogAllConnections    = ( true )
grant:
   /opt/cfmaster *.home.genehack.org
Start cfservd:
# /etc/init.d/cfservd start
-OR-
# /var/cfengine/bin/cfservd
slide12
Set up update.conf on mendel
First, the actual /opt/cfmaster/var/cfengine/inputs/update.conf file:
control:
   actionsequence = ( copy tidy )
   domain         = ( home.genehack.org )
   policyhost     = ( mendel )
   master         = ( /opt/cfmaster )
   local_dir      = ( /var/cfengine )
# get a local copy of configuration and binaries
copy:
   $(master)/$(local_dir)/inputs dest=$(local_dir)/inputs
      r=inf
      mode=700
      type=checksum
      exclude=*~
      exclude=#*
      exclude=.svn
      server=$(policyhost)
      trustkey=true
   $(master)/$(local_dir)/bin dest=$(local_dir)/bin
      r=1
      mode=755
      backup=false
      type=checksum
      server=$(policyhost)
   # set up periodic cron job
   $(master)/etc/cron.hourly/cfexecd dest=/etc/cron.hourly/cfexecd
      mode=755
      backup=false
      type=checksum
      server=$(policyhost)
# clean up old output
tidy:
   $(local_dir)/outputs pattern=* age=7
Then set up the file repository:
# mkdir -p /opt/cfmaster/var/cfengine/{bin,inputs}
# mkdir -p /opt/cfmaster/etc/cron.hourly
# cp -p /var/cfengine/bin/* /opt/cfmaster/var/cfengine/bin
Finally, set up the cron file stub at
/opt/cfmaster/etc/cron.hourly/cfexecd:
#!/bin/sh
/var/cfengine/bin/cfexecd -F
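The copy entries in update.conf use type=checksum, so a destination file is only replaced when its content differs from the master copy. A rough shell sketch of that idea follows. copy_if_changed is a hypothetical helper written for illustration (it is not part of cfengine), and it assumes the POSIX cksum utility is a fair stand-in for whatever digest cfengine computes internally.

```sh
# Hypothetical helper, not part of cfengine: copy SRC over DEST only when
# the two files differ, judged by comparing their cksum output.
# Returns 0 if a copy was made, non-zero otherwise (1 when DEST was
# already up to date).
copy_if_changed() {
    src=$1
    dest=$2
    if [ -f "$dest" ] && [ "$(cksum < "$src")" = "$(cksum < "$dest")" ]; then
        return 1            # checksums match: nothing to do
    fi
    cp -p "$src" "$dest"    # missing or different: replace it
}
```

Called repeatedly against an unchanged master file, the function copies once and then does nothing.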
slide13
Distribute software and update.conf
to koch
# ssh root@koch mkdir -p /var/cfengine/{bin,inputs,outputs}
# scp /var/cfengine/bin/{cfagent,cfkey} koch:/var/cfengine/bin
# scp /var/cfengine/inputs/update.conf koch:/var/cfengine/inputs
Note that we're only distributing three files here: two binaries,
and one config file. Ideally, you'd actually want this to be part of
your base machine install.
slide14
Generate key pairs on mendel and koch
# /var/cfengine/bin/cfkey
Making a key pair for cfengine, please wait, this could take a minute...
Writing private key to /var/cfengine/ppkeys/localhost.priv
Writing public key to /var/cfengine/ppkeys/localhost.pub
# ssh koch /var/cfengine/bin/cfkey
Making a key pair for cfengine, please wait, this could take a minute...
Writing private key to /var/cfengine/ppkeys/localhost.priv
Writing public key to /var/cfengine/ppkeys/localhost.pub
Again, this should really happen during the initial machine install
(and if you use your distribution's copy of cfengine, it may.)
slide15
Set up cfagent.conf on mendel
Edit /opt/cfmaster/var/cfengine/inputs/cfagent.conf:
control:
   domain        = ( home.genehack.org )
   timezone      = ( EST )
   smtpserver    = ( localhost )         # used by cfexecd
   sysadm        = ( root@genehack.org ) # where to mail output
   EmailMaxLines = ( inf )
   policyhost    = ( mendel )
   master        = ( /opt/cfmaster )
   actionsequence = (
      processes
      copy
      directories
      files
      links
      tidy
      shellcommands
   )
processes:
   any::
      "lib/postfix/master"
         restart "/etc/init.d/postfix start >/dev/null 2>&1"
   mendel::
      "cfservd"
         restart "/var/cfengine/bin/cfservd"
directories:
   any::
      # postfix needs a certain set of permissions and ownerships
      # in its spool dir:
      /var/spool/postfix/active   mode=0700 o=postfix g=root
      /var/spool/postfix/bounce   mode=0700 o=postfix g=root
      /var/spool/postfix/corrupt  mode=0700 o=postfix g=root
      /var/spool/postfix/defer    mode=0700 o=postfix g=root
      /var/spool/postfix/deferred mode=0700 o=postfix g=root
      /var/spool/postfix/flush    mode=0700 o=postfix g=root
      /var/spool/postfix/hold     mode=0700 o=postfix g=root
      /var/spool/postfix/incoming mode=0700 o=postfix g=root
      /var/spool/postfix/maildrop mode=0730 o=postfix g=postdrop
      /var/spool/postfix/pid      mode=0755 o=root    g=root
      /var/spool/postfix/private  mode=0700 o=postfix g=root
      /var/spool/postfix/public   mode=0710 o=postfix g=postdrop
files:
   any::
      # Check permissions on some important files
      /etc/group  mode=644 o=root g=root action=fixall
      /etc/passwd mode=644 o=root g=root action=fixall
      /etc/shadow mode=600 o=root g=root action=fixall
tidy:
   any::
      /tmp pat=*     age=14 r=inf rmdirs=true inform=false
      /tmp pat=core* age=0  r=inf inform=true
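The first tidy: rule removes files under /tmp that are older than 14 days. As a rough point of comparison, here is how a similar cleanup might look as a shell function. tidy_older_than is a hypothetical name, and the sketch assumes modification time is the right timestamp to test; cfengine's own age test may use a different one.

```sh
# Hypothetical helper, not part of cfengine: delete regular files under DIR
# that are older than DAYS days, as find counts them.
# Assumption: modification time is the timestamp of interest.
tidy_older_than() {
    dir=$1
    days=$2
    # -mtime +N matches files whose age, counted in whole days, exceeds N
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}
```

Calling `tidy_older_than /tmp 14` would roughly correspond to the first rule, minus the directory removal that rmdirs=true asks for.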
slide16
Initial cfagent run on mendel
(Lightly trimmed for space reasons)
GNU Configuration Engine -
2.1.17
Free Software Foundation 1994-
Donated by Mark Burgess, Faculty of Engineering,
Oslo University College, 0254 Oslo, Norway
------------------------------------------------------------------------
Host name is: mendel
Operating System Type is linux
Operating System Release is 2.6.14-gentoo
Architecture = i686
The time is now Tue Dec 20 21:21:38 2005
------------------------------------------------------------------------
Additional hard class defined as: 32_bit
Additional hard class defined as: linux_2_6_14_gentoo
Additional hard class defined as: linux_i686
Additional hard class defined as: linux_i686_2_6_14_gentoo
GNU autoconf class from compile time: compiled_on_linux_gnu
Address given by nameserver: 192.168.1.10
Interface 1: eth0
Interface 2: lo
Trying to locate my IPv6 address
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
* (Changing context state to: update) *
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Looking for an input file /var/cfengine/inputs/update.conf
Finished with update.conf
LogDirectory = /var/cfengine
Loaded /var/cfengine/ppkeys/localhost.priv
Loaded /var/cfengine/ppkeys/localhost.pub
Checksum database is /var/cfengine/checksum.db
*********************************************************************
Update Sched: copy pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
Checking copy from localhost:/opt/cfmaster/var/cfengine/inputs to
/var/cfengine/inputs
cfengine:: /var/cfengine/inputs/cfagent.conf wasn't at destination (copying)
cfengine:: Copying from
localhost:/opt/cfmaster/var/cfengine/inputs/cfagent.conf
cfengine:: Object /var/cfengine/inputs/cfagent.conf had permission
600, changed it to 700
Checking copy from localhost:/opt/cfmaster/var/cfengine/bin to
/var/cfengine/bin
Checking copy from localhost:/opt/cfmaster/etc/cron.hourly/cfexecd to
/etc/cron.hourly/cfexecd
cfengine:: /etc/cron.hourly/cfexecd wasn't at destination (copying)
cfengine:: Copying from localhost:/opt/cfmaster/etc/cron.hourly/cfexecd
cfengine:: Object /etc/cron.hourly/cfexecd had permission 600, changed
it to 755
*********************************************************************
Update Sched: tidy pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
* (Changing context state to: main) *
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Looking for an input file /var/cfengine/inputs/cfagent.conf
Finished with cfagent.conf
Accepted domain name: home.genehack.org
Defined Classes = ( 192_168_1 192_168_1_10 32_bit Day20 December Hr21
Hr21_Q2 Min20_25 Min21 Q2 Tuesday Yr2005 any cfengine_2 cfengine_2_1
cfengine_2_1_17 compiled_on_linux_gnu i686 ipv4_192 ipv4_192_168
ipv4_192_168_1 ipv4_192_168_1_10 linux linux_2_6_14_gentoo linux_i686
linux_i686_2_6_14_gentoo
linux_i686_2_6_14_gentoo__1_PREEMPT_Sun_Oct_30_15_22_10_EST_2005
mendel mendel_home_genehack_org net_iface_eth0 net_iface_lo )
Reference time set to Tue Dec 20 21:21:38 2005
*********************************************************************
Main Tree Sched: processes pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
cfengine:mendel: Running process command /bin/ps auxw
Defining classes
DoSignals(lib/postfix/master)
Existing restart sequence found (/etc/init.d/postfix start >/dev/null 2>&1)
cfengine:mendel: Matches found for lib/postfix/master - no restart sequence
Defining classes
DoSignals(cfservd)
Existing restart sequence found (/var/cfengine/bin/cfservd)
cfengine:mendel: Matches found for cfservd - no restart sequence
*********************************************************************
Main Tree Sched: copy pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
*********************************************************************
Main Tree Sched: directories pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
MakePath(/var/spool/postfix/active)
MakePath(/var/spool/postfix/bounce)
MakePath(/var/spool/postfix/corrupt)
MakePath(/var/spool/postfix/defer)
MakePath(/var/spool/postfix/deferred)
MakePath(/var/spool/postfix/flush)
MakePath(/var/spool/postfix/hold)
MakePath(/var/spool/postfix/incoming)
MakePath(/var/spool/postfix/maildrop)
MakePath(/var/spool/postfix/pid)
MakePath(/var/spool/postfix/private)
MakePath(/var/spool/postfix/public)
*********************************************************************
Main Tree Sched: files pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
Checking file(s) in /etc/group
Checking file(s) in /etc/passwd
Checking file(s) in /etc/shadow
*********************************************************************
Main Tree Sched: links pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
*********************************************************************
Main Tree Sched: tidy pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
*********************************************************************
Main Tree Sched: shellcommands pass 1 @ Tue Dec 20 21:21:38 2005
*********************************************************************
slide17
Initial cfagent run on koch
(Again, lightly trimmed for space reasons)
GNU Configuration Engine -
2.1.17
Free Software Foundation 1994-
Donated by Mark Burgess, Faculty of Engineering,
Oslo University College, 0254 Oslo, Norway
------------------------------------------------------------------------
Host name is: koch
Operating System Type is linux
Operating System Release is 2.6.14-gentoo
Architecture = i686
The time is now Tue Dec 20 21:24:03 2005
------------------------------------------------------------------------
Additional hard class defined as: 32_bit
Additional hard class defined as: linux_2_6_14_gentoo
Additional hard class defined as: linux_i686
Additional hard class defined as: linux_i686_2_6_14_gentoo
GNU autoconf class from compile time: compiled_on_linux_gnu
Address given by nameserver: 192.168.1.11
Interface 1: lo
Interface 2: eth0
Trying to locate my IPv6 address
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
* (Changing context state to: update) *
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Looking for an input file /var/cfengine/inputs/update.conf
Finished with update.conf
LogDirectory = /var/cfengine
Loaded /var/cfengine/ppkeys/localhost.priv
Loaded /var/cfengine/ppkeys/localhost.pub
Checksum database is /var/cfengine/checksum.db
*********************************************************************
Update Sched: copy pass 1 @ Tue Dec 20 21:24:03 2005
*********************************************************************
Checking copy from mendel:/opt/cfmaster/var/cfengine/inputs to
/var/cfengine/inputs
Connect to mendel = 192.168.1.10 on port 5308
cfengine:: Trusting server identity and willing to accept key from
mendel=192.168.1.10
Saving public key /var/cfengine/ppkeys/root-192.168.1.10.pub
cfengine:: /var/cfengine/inputs/cfagent.conf wasn't at destination (copying)
cfengine:: Copying from mendel:/opt/cfmaster/var/cfengine/inputs/cfagent.conf
cfengine:: Object /var/cfengine/inputs/cfagent.conf had permission
600, changed it to 700
Checking copy from mendel:/opt/cfmaster/var/cfengine/bin to
/var/cfengine/bin
Checking copy from mendel:/opt/cfmaster/etc/cron.hourly/cfexecd to
/etc/cron.hourly/cfexecd
cfengine:: /etc/cron.hourly/cfexecd wasn't at destination (copying)
cfengine:: Copying from mendel:/opt/cfmaster/etc/cron.hourly/cfexecd
cfengine:: Object /etc/cron.hourly/cfexecd had permission 600, changed
it to 755
*********************************************************************
Update Sched: tidy pass 1 @ Tue Dec 20 21:24:03 2005
*********************************************************************
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
* (Changing context state to: main) *
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Looking for an input file /var/cfengine/inputs/cfagent.conf
Finished with cfagent.conf
Accepted domain name: home.genehack.org
Defined Classes = ( 192_168_1 192_168_1_11 32_bit Day20 December Hr21
Hr21_Q2 Min20_25 Min24 Q2 Tuesday Yr2005 any cfengine_2 cfengine_2_1
cfengine_2_1_17 compiled_on_linux_gnu i686 ipv4_192 ipv4_192_168
ipv4_192_168_1 ipv4_192_168_1_11 koch koch_home_genehack_org linux
linux_2_6_14_gentoo linux_i686 linux_i686_2_6_14_gentoo
linux_i686_2_6_14_gentoo__1_Sun_Oct_30_11_36_26_EST_2005
net_iface_eth0 net_iface_lo )
Reference time set to Tue Dec 20 21:24:03 2005
*********************************************************************
Main Tree Sched: processes pass 1 @ Tue Dec 20 21:24:03 2005
*********************************************************************
cfengine:koch: Running process command /bin/ps auxw
Defining classes
DoSignals(lib/postfix/master)
Existing restart sequence found (/etc/init.d/postfix start >/dev/null 2>&1)
cfengine:koch: Matches found for lib/postfix/master - no restart sequence
*********************************************************************
Main Tree Sched: copy pass 1 @ Tue Dec 20 21:24:04 2005
*********************************************************************
*********************************************************************
Main Tree Sched: directories pass 1 @ Tue Dec 20 21:24:04 2005
*********************************************************************
MakePath(/var/spool/postfix/active)
MakePath(/var/spool/postfix/bounce)
MakePath(/var/spool/postfix/corrupt)
MakePath(/var/spool/postfix/defer)
MakePath(/var/spool/postfix/deferred)
MakePath(/var/spool/postfix/flush)
MakePath(/var/spool/postfix/hold)
MakePath(/var/spool/postfix/incoming)
MakePath(/var/spool/postfix/maildrop)
MakePath(/var/spool/postfix/pid)
MakePath(/var/spool/postfix/private)
MakePath(/var/spool/postfix/public)
*********************************************************************
Main Tree Sched: files pass 1 @ Tue Dec 20 21:24:04 2005
*********************************************************************
Checking file(s) in /etc/group
Checking file(s) in /etc/passwd
Checking file(s) in /etc/shadow
*********************************************************************
Main Tree Sched: links pass 1 @ Tue Dec 20 21:24:04 2005
*********************************************************************
*********************************************************************
Main Tree Sched: tidy pass 1 @ Tue Dec 20 21:24:04 2005
*********************************************************************
*********************************************************************
Main Tree Sched: shellcommands pass 1 @ Tue Dec 20 21:24:33 2005
*********************************************************************
slide18
Things To Note
- The process is repetitive: each cfagent invocation is literally two runs of the exact same process, first with update.conf and then with cfagent.conf
- Even if cfagent.conf is totally broken, the re-bootstrapping that happens with update.conf on each run means that you can get things working again by fixing the copy of cfagent.conf in the master repository and waiting for the next update cycle
- All actions are designed to move towards the desired configuration
- Actions can be repeated as often as desired, and the configuration will still converge to the desired point
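The last two points describe convergence: an action states the desired end result, so running it again changes nothing once the system is already there. A small shell sketch of an operation with that property is below. ensure_line is a hypothetical helper, not a cfengine command.

```sh
# Hypothetical helper, not part of cfengine: make sure FILE contains LINE
# as a complete line, appending it only when it is missing.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Running it once or fifty times leaves the file with exactly one copy of the line, which is the kind of repeatable behavior the slide describes.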
slide19
Practical Considerations
- Splitting up cfagent.conf
- Organizing files under /opt/cfengine
- Version control (CVS, SVN, etc.) for /opt/cfengine
- Packaging the cfengine software -- ideally, directory setup, key generation, and initial cfagent run will happen as part of initial image deployment
slide20
One potential /opt/cfengine configuration
/opt/cfengine/bin
/opt/cfengine/bin/update_master
/opt/cfengine/GENERIC/etc/motd
/opt/cfengine/SPECIFIC/etc/fstab/host1
/opt/cfengine/SPECIFIC/etc/fstab/host2
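The listing does not say how these trees are consumed. One plausible reading, sketched below in shell, is that a host-specific file under SPECIFIC takes precedence over a shared file under GENERIC. The function name master_source and that precedence rule are assumptions made for this illustration only.

```sh
# Hypothetical lookup, assuming the layout above: a host-specific file at
# ROOT/SPECIFIC/<path>/<host> wins over a shared one at ROOT/GENERIC/<path>.
# Prints the chosen source path, or returns 1 if neither exists.
master_source() {
    root=$1
    path=$2
    host=$3
    if [ -f "$root/SPECIFIC/$path/$host" ]; then
        printf '%s\n' "$root/SPECIFIC/$path/$host"
    elif [ -f "$root/GENERIC/$path" ]; then
        printf '%s\n' "$root/GENERIC/$path"
    else
        return 1
    fi
}
```

With the files listed above, `master_source /opt/cfengine etc/fstab host1` would print the host1 file, while `master_source /opt/cfengine etc/motd host1` would fall back to the shared motd.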
slide21
Other things you can do with cfengine
mountall - mount filesystems in fstab
mountinfo - scan mounted filesystems
checktimezone - check timezone
netconfig - check net interface config
resolve - check resolver setup
unmount - unmount filesystems
mailcheck - check mailserver
editfiles - edit files
addmounts - add new filesystems
required - check required filesystems
disable - disable files
module:name - execute a user-defined module
slide22
A final word about groups:
Possibly the coolest thing about cfengine, and
the major way of conditionalizing actions
Examples:
science                  = ( saga tor odin )
notthis                  = ( !this )
our_nonrouted_class_c    = ( IPRange( 192.168.1.0/24 ))
another_nonrouted_range  = ( IPRange( 192.168.0.1-20 ))
pulled_from_nis_netgroup = ( +nis-netgroup-name )
nis_netgroup_minus_one   = ( +nis-netgroup -onehost )
have_apache              = ( '/bin/test -f /usr/sbin/apache' )
Groups can be dynamically defined, conditional on the result of an
action. This lets you do classic 'if-then' logic to (for example)
trigger subsequent actions based on file copies:
control:
   actionsequence = ( copy shellcommands )
copy:
   mail_srvr::
      $(master)/MAIL/etc/aliases
         dest=/etc/aliases
         type=checksum
         backup=false
         define=run_newaliases
         server=$(policyhost)
shellcommands:
   run_newaliases::
      "/usr/bin/newaliases"
Classes can also be combined with Boolean operators, and then
actions will only "fire" when the whole combination evaluates to
true:
this|that::
this.that::
(this|that).the_other::
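To make both ideas concrete, here is a loose shell analogy: a copy step that defines a class only when it actually changed something, and a guard that runs a command only when a class expression is true. Every function name in it is hypothetical, and cfengine does all of this internally. The sketch assumes `.` means AND, `|` means OR, `!` means NOT, that AND binds more tightly than OR, and that class names contain only letters, digits, and underscores.

```sh
# Loose analogy only; every function name below is hypothetical.
classes=""                       # space-separated list of defined classes

define_class() { classes="$classes $1"; }

# Evaluate a class expression against $classes by translating it into
# shell arithmetic: defined classes become 1, undefined classes become 0.
eval_classes() {
    arith=""
    # put spaces around operators so the expression splits into tokens
    for tok in $(printf '%s\n' "$1" | sed 's/[().|!]/ & /g'); do
        case $tok in
            '(' | ')' | '!') arith="$arith $tok" ;;
            '.') arith="$arith &&" ;;
            '|') arith="$arith ||" ;;
            *)  case " $classes " in
                    *" $tok "*) arith="$arith 1" ;;   # class is defined
                    *)          arith="$arith 0" ;;   # class is not defined
                esac ;;
        esac
    done
    [ "$(( $arith ))" -ne 0 ]
}

# Copy SRC to DEST if they differ; define CLASS only when a copy happened.
copy_step() {
    if ! cmp -s "$1" "$2"; then
        cp "$1" "$2" && define_class "$3"
    fi
}

# Run a command only when the class expression EXPR is true.
run_if() {
    expr=$1
    shift
    if eval_classes "$expr"; then
        "$@"
    fi
}
```

A run might then call `copy_step` with a master aliases file, /etc/aliases, and the class name run_newaliases, followed by `run_if run_newaliases /usr/bin/newaliases`; the second command does nothing unless the first one replaced the file.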
slide23
Where to go from here - a very partial resource list
- cfengine.org
- cfwiki.org
- freenode.net#cfengine
- Automating UNIX and Linux Administration - Kirk Bauer
- Essential System Administration - Æleen Frisch
- infrastructures.org -- particularly the "Bootstrapping An Infrastructure" paper at http://www.infrastructures.org/papers/bootstrap/bootstrap.html
- http://www.isconf.org/
- http://www.lcfg.org/
slide24