New WordPress Theme: codegeek April 14th, 2012

As you may have noticed, I’ve replaced the old theme here on my blog with something a little cleaner and simpler. I couldn’t really find anything I liked or wanted on the net, so I threw something simple together using Twitter’s excellent Bootstrap library along with a little help from Bootswatch.

If you’d like a copy to use or adapt, you can grab it from https://github.com/gregarmer/wp-codegeek. Beware though, it is by no means a complete WordPress theme and only handles my specific requirements at this point.

Replacing Django’s Nasty “runserver” December 6th, 2011

Have you ever tried to have more than one person view a development site using Django’s built-in development server? Yeah, it really sucks. Apparently concurrency wasn’t high on the feature list, and the Django developers have stated that it never will be:

DO NOT USE THIS SERVER IN A PRODUCTION SETTING. It has not gone through security audits or performance tests. (And that’s how it’s gonna stay. We’re in the business of making Web frameworks, not Web servers, so improving this server to be able to handle a production environment is outside the scope of Django.)

So how do we go about using something a little nicer without losing any of the auto-reload goodness and without having to set up a full-blown production environment?

There are a number of alternatives; however, I’ve selected Twisted Web simply because I really like the Twisted framework and, thanks to my experience with it, I am very comfortable using it. It’s a great feature-packed web server that handles concurrency (and a ton of other things) exceptionally well.

So how do we use it to serve our little Django project in a development-friendly way?

I’ve put together some code (some borrowed from other sources) and constructed a simple replacement command called “trunserver” (twisted-runserver). You can grab this code from GitHub. Simply install it using the standard methods, and run it with:

python manage.py trunserver [IP:PORT] [--settings=foo] [--noreload]

This will start up a Twisted Web instance serving your Django project and, just like the built-in runserver, it will automatically reload your code when it notices that your files have been modified, unless --noreload has been passed.
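To give a rough idea of what this kind of auto-reload boils down to, here is a minimal sketch of the general polling approach: remember each file’s modification time and compare it on every pass. This is not the actual trunserver code; the function names scan and changed are made up for illustration, and a real reloader would also have to restart the server process.

```python
import os

def scan(paths):
    """Return {path: mtime} for every path that currently exists."""
    return {p: os.stat(p).st_mtime for p in paths if os.path.exists(p)}

def changed(paths, previous):
    """Return the paths whose mtime differs from the previous snapshot."""
    current = scan(paths)
    return sorted(p for p, mtime in current.items() if previous.get(p) != mtime)
```

A reloader would take a snapshot with scan() at startup, call changed() every second or so, and restart the server as soon as it returns a non-empty list.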

There are a few things missing at this point, like IPv6 support and static file serving, however these are on the roadmap.

I’ll post again with some more info once it is a little more stable and an official release has been provided.

The impact of being behind schedule February 11th, 2011

In managing a group of software engineers, this is something that has happened frequently in my team and has been bothering me for a while. It’s a lot easier for me to notice, as in my case, I actively write software with my team.

The problem

The entire team tends to perform so much better when we’re ahead of schedule: our spirits are high, everyone is motivated, the Scrum board is bouncing around actively and everything is going great. However, as soon as the pressure starts to increase, a few milestones are missed and things start falling a little behind schedule. The entire team rapidly starts losing hope, everyone appears lethargic, demotivation kicks in and things slowly start grinding to a halt.

So how do we stop this?

In trying to curb this level of demotivation and fatigue, we first need to understand why this happens. In reality, being a bit behind schedule is really not the end of the world. Estimates are provided on project milestones, but we need to realize that they are called estimates for a reason. No matter how many proven processes your software engineering team has in place and how good you have become at determining your team’s velocity, there will always be parts of a project that cannot be put into a little box with a start and end date.

In addition to that, even though your estimates may be quite realistic, you can never accurately gauge what other problems may come along during a sprint. In our environment, we often have “urgent” requests to deal with; bugs, emergency maintenance, and other pesky time-wasters. To the management suits upstairs, these may seem inconsequential but in my experience, they have a far greater reaching impact than the suits realize.

All of this unexpected work contributes to pushing the team behind schedule. Most times we can catch up without impacting the project’s final delivery, but there are rare times where we fall further and further behind schedule. It is at these times that the team seems to get stuck in a cycle of despair and their relative output is reduced to who shouts at them the loudest.

So far, I have not found a good way to reverse this mindset after it has happened. The best way to work around this problem, in my humble opinion, is to not get there in the first place. Software engineers, sales teams and clients must realize that deadlines are going to be missed, specs are not always accurate and all kinds of impediments are going to get in the way of delivering quality work on time. The best thing we can do to prevent this is to manage everyone’s expectations in the best way possible.

Keeping everyone happy

Communication is key in managing the expectations of everyone involved. It is a lot easier to keep everyone happy when they know upfront that the team is falling behind schedule. The pressure from clients is reduced when they are informed early that an expected delivery date is unlikely to be hit, which in turn reduces the pressure on the team. This contributes greatly to keeping the workforce in high spirits, amidst the whooshing sound of missed milestones flying by, and lets them stay motivated and productive.

Increasing the amount of pressure really does nothing to help a project along, although this is often the only solution that the clients and non-developers can think up. In fact, I strongly believe it does just the opposite of what it was intended to do. Adding pressure to an already drowning team only culls whatever motivation was still remaining. This leads to developers lying about the status of a project in a desperate attempt to alleviate that pressure. That inaccurate status gets communicated back to the stakeholders and the cycle just begins again, except with more pressure as the team is now even further behind.

In conclusion

Software engineers, be honest and accurate about your actual status. It may not seem like it, but you’re only going to help yourselves in the long run. Suits, be nicer to your workforce; they’re doing the best they can. Adding more pressure is helping no one.

That is all.

The day the routers died… February 3rd, 2011

On February 3, 1959, Buddy Holly, Ritchie Valens and JP Richardson (aka The Big Bopper) died in a plane crash. Don McLean immortalized that day as “The Day The Music Died” in his 1971 hit, “American Pie”.

It’s somewhat ironic that on February 3, 2011, the last five /8s from the IANA IPv4 pool were distributed to the RIRs.

102/8   AfriNIC    2011-02    whois.afrinic.net ALLOCATED
103/8   APNIC      2011-02    whois.apnic.net   ALLOCATED
104/8   ARIN       2011-02    whois.arin.net    ALLOCATED
179/8   LACNIC     2011-02    whois.lacnic.net  ALLOCATED
185/8   RIPE NCC   2011-02    whois.ripe.net    ALLOCATED

During a RIPE55 meeting surrounding IPv4 exhaustion, this rephrased version of that 1971 hit was played:

Puppet Modules – Debsecan January 2nd, 2011

This is the first post of (hopefully) many, detailing some of my Puppet module implementations. Being the first, I thought I would start off with something simple.

Debsecan
The debsecan program evaluates the security status of a host running the Debian operation system. It reports missing security updates and known vulnerabilities in the programs which are installed on the host.

This is a great package that I wanted installed on all Debian machines across my entire infrastructure. Thanks to Puppet, this is a breeze.

Module layout

greg@codemine:~/code/puppet %> find modules/debsecan
modules/debsecan
modules/debsecan/files
modules/debsecan/files/debsecan
modules/debsecan/files/debsecan-cron
modules/debsecan/manifests
modules/debsecan/manifests/init.pp

Manifest – init.pp

greg@codemine:~/code/puppet %> cat modules/debsecan/manifests/init.pp
class debsecan {
    package { debsecan: ensure => latest }

    file {
        debsecan:
            path => "/etc/default/debsecan",
            owner => root,
            group => "root",
            mode => 644,
            source  => "puppet:///debsecan/debsecan",
            require => Package["debsecan"];
        debsecan-cron:
            path => "/etc/cron.d/debsecan",
            owner => root,
            group => "root",
            mode => 644,
            source  => "puppet:///debsecan/debsecan-cron",
            require => Package["debsecan"];
    }
}

There is really not much to this manifest. It essentially ensures debsecan is installed at the latest available version, it sets up my /etc/default/debsecan config and it ensures there is a cron entry to run it.

Debsecan config

greg@codemine:~/code/puppet %> cat modules/debsecan/files/debsecan
# Configuration file for debsecan.  Contents of this file should
# adhere to the KEY=VALUE shell syntax.  This file may be edited by
# debsecan's scripts, but your modifications are preserved.

# If true, enable daily reports, sent by email.
REPORT=true

# For better reporting, specify the correct suite here, using the code
# name (that is, "sid" instead of "unstable").
SUITE=lenny

# Mail address to which reports are sent.
MAILTO=root

# The URL from which vulnerability data is downloaded.  Empty for the
# built-in default.
SOURCE=

Debsecan cron

greg@codemine:~/code/puppet %> cat modules/debsecan/files/debsecan-cron
# cron entry for debsecan
MAILTO=root

42 * * * * daemon test -x /usr/bin/debsecan && /usr/bin/debsecan --cron
# (Note: debsecan delays actual processing past 2:00 AM, and runs only
# once per day.)

You can grab a copy of all the above files (the complete module) here: debsecan-puppet.tar.gz

Using ferm to build firewall rulesets December 31st, 2010

This post is thanks to a suggestion from JP Viljoen to check out ferm. Well, I did, and it’s fairly neat. You get to express your firewall configuration in structures resembling simple C code along with using things like arrays, functions and if / else constructs which makes building complex rulesets quite a simple task.

I’ve included an example configuration below from one of my machines. The network configuration is not extremely complex, but there is a mix of IPv4, IPv6 and (as this is an IRC server) some DNAT to make the IRC service available on a number of other privileged ports without having the service actually listen on those ports. This particular server is running Debian; however, ferm is basically just a front end to ip(6)tables, so it’ll run pretty much anywhere they run.

First off, here is my network interface configuration to give you an idea of what is where:

kore:~# cat /etc/network/interfaces 

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 173.134.21.19             # Static eth0 IP
    netmask 255.255.255.0
    gateway 173.134.21.1

iface eth0 inet6 static
    address 2001:410:1e9b:ba22::2     # Primary HE.net IPv6 /64 address
    netmask 64

auto eth0:0
iface eth0:0 inet static
    address 192.168.49.97             # Local networking
    netmask 255.255.128.0

auto he-ipv6
iface he-ipv6 inet6 v4tunnel
    address 2001:410:1e9a:ba22::2     # Tunnel address
    netmask 64
    ttl 255
    gateway 2001:410:1e9a:ba22::1
    endpoint 216.218.224.42
    local 173.134.21.19

There is nothing extremely complicated here, just a basic IPv4 static IP assigned by my provider, a local network for traffic between this and other local nodes, a Hurricane Electric IPv6 tunnel and a static IP from my HE.net provided /64.

The ferm configuration in use here looks like this:

kore:~# cat /etc/ferm/ferm.conf
# -*- shell-script -*-
#
#  Configuration file for ferm(1).
#

@def $PORTS = (22 25 161 4949 6667 6668 7000 7352 7535); # Services running
@def $IRC_PORTS = (21 23 53 80 110 143 993);             # Additional ports

table filter {
    chain INPUT {
        policy DROP;

        # connection tracking
        mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;

        # allow local packages
        interface lo ACCEPT;

        # respond to ping
        proto icmp ACCEPT; 

        # standard ports we allow from the outside
        proto tcp dport $PORTS ACCEPT;
    }

    chain OUTPUT {
        policy ACCEPT;

        # connection tracking
        #mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;
    }

    chain FORWARD {
        policy DROP;

        # connection tracking
        mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;
    }
}

table nat {
    chain PREROUTING {
        # additional ports we listen on and redirect to the IRC server
        interface eth0 proto tcp dport $IRC_PORTS DNAT to 173.134.21.19:6667;
    }
}

# IPv6:
domain ip6 table filter {
    chain INPUT {
        policy DROP;

        # connection tracking
        mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;

        # allow ICMP (for neighbor solicitation, like ARP for IPv4)
        proto ipv6-icmp ACCEPT;

        # standard ports we allow from the outside
        proto tcp dport $PORTS ACCEPT;
    }

    chain OUTPUT {
        policy ACCEPT;

        # connection tracking
        #mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;
    }

    chain FORWARD {
        policy DROP;

        # connection tracking
        mod state state INVALID DROP;
        mod state state (ESTABLISHED RELATED) ACCEPT;
    }
}

So this ruleset is basically broken down into 3 parts:

  • IPv4 filter table
  • IPv4 nat table
  • IPv6 filter table

IPv4 filter table
We control the INPUT, OUTPUT and FORWARD chains here. On the INPUT chain, we default to dropping everything, enable connection state tracking, allow all traffic through our local interface, allow ICMP and specify a list of ports we allow the outside world to use. On the OUTPUT chain we allow everything out and enable connection state tracking. Finally, on the FORWARD chain we drop everything as this machine is not a router. Pretty concise, right?

IPv4 nat table
In the nat table config, we basically set up the DNAT of those privileged ports under the PREROUTING chain.

IPv6 filter table
Finally, in the IPv6 filter table, we allow the same set of incoming ports as IPv4, allow ipv6-icmp and set up connection state tracking as before.

Once that’s done, simply run:

kore:~# ferm /etc/ferm/ferm.conf

… and your new ruleset is validated and loaded.
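To make the nat table rule a bit more concrete, here is a rough Python sketch of the plain iptables commands it corresponds to, one per port in $IRC_PORTS. This is only an illustration: dnat_rules is a made-up name, and ferm itself may well generate something more compact (for example a single multiport rule).

```python
IRC_PORTS = [21, 23, 53, 80, 110, 143, 993]

def dnat_rules(ports, iface="eth0", target="173.134.21.19:6667"):
    """Build one iptables DNAT command per redirected port."""
    return [
        "iptables -t nat -A PREROUTING -i %s -p tcp --dport %d "
        "-j DNAT --to-destination %s" % (iface, port, target)
        for port in ports
    ]

for rule in dnat_rules(IRC_PORTS):
    print(rule)
```

Seven near-identical lines for one line of ferm config is a fair picture of why the array syntax is so pleasant to work with.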

On a side note, if you are interested in playing around with IPv6 I would highly recommend setting up a Hurricane Electric tunnel and then doing the certification. It makes for a great Saturday afternoon time waster and you might learn something along the way:
IPv6 Certification

Natural order sorting strings with numbers September 23rd, 2010

The following Python code makes natural sorting of sequences of lexical and numerical values a little easier. It supports any iterable containing strings which have embedded numbers. In short, it would give you this:

foo1 < foo2 < foo10

instead of this:

foo1 < foo10 < foo2

As an example, if you have this sequence:

>>> seq = ['foo', 'foo1', 'foo2', 'foo10', 'foobar10', '20', '100', '1', '3', 'bar1']

a regular sort would produce this:

>>> sorted(seq)
['1', '100', '20', '3', 'bar1', 'foo', 'foo1', 'foo10', 'foo2', 'foobar10']

whereas a natural sort would produce this:

>>> natural_sort(seq)
['1', '3', '20', '100', 'bar1', 'foo', 'foo1', 'foo2', 'foo10', 'foobar10']

Here is the code:

import re

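def natsort_key(item):
    # Split into alternating text and number chunks; the capture group keeps
    # the numbers (integers or decimals) in the result.
    chunks = re.split(r'(\d+(?:\.\d+)?)', item)
    for ii in range(len(chunks)):
        if chunks[ii] and chunks[ii][0] in '0123456789':
            numtype = float if '.' in chunks[ii] else int
            # Tag numbers with 0 and text with 1 so the two are never compared
            # directly; numbers sort ahead of text.
            chunks[ii] = (0, numtype(chunks[ii]))
        else:
            chunks[ii] = (1, chunks[ii])
    return (chunks, item)

def natural_sort(seq):
    # sorted() accepts any iterable and returns a new list.
    return sorted(seq, key=natsort_key)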
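One caveat worth knowing: because the pattern also matches decimals, a string like 'v1.10' is read as the number 1.1, so it sorts before 'v1.2'. If you are sorting version-like strings, an integer-only variant is a small change. This is just a sketch; natsort_key_int is a name invented for this example.

```python
import re

def natsort_key_int(item):
    # With one capture group, re.split() puts the matched digit runs at the
    # odd indices and the text in between at the even indices.
    chunks = re.split(r'(\d+)', item)
    return [(0, int(c)) if i % 2 else (1, c) for i, c in enumerate(chunks)]

versions = ['v1.10', 'v1.9', 'v1.2']
print(sorted(versions, key=natsort_key_int))
# ['v1.2', 'v1.9', 'v1.10']
```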

Generating a dependency graph for a PostgreSQL database July 9th, 2010

This post was mostly inspired by this one, which shows how to generate a dependency graph for a MySQL database. Here we do something similar for PostgreSQL.

This script generates the digraph data to pipe into Graphviz’s dot, which renders a visual representation of the dependencies in a database schema based on its foreign key constraints.

The script:


from optparse import OptionParser, OptionGroup

import psycopg2
import sys

def writedeps(cursor, tbl):
    sql = """SELECT
        tc.constraint_name, tc.table_name, kcu.column_name,
        ccu.table_name AS foreign_table_name,
        ccu.column_name AS foreign_column_name
    FROM
        information_schema.table_constraints AS tc
    JOIN information_schema.key_column_usage AS kcu ON
        tc.constraint_name = kcu.constraint_name
    JOIN information_schema.constraint_column_usage AS ccu ON
        ccu.constraint_name = tc.constraint_name
    WHERE constraint_type = 'FOREIGN KEY' AND tc.table_name = %s"""
    # Let psycopg2 quote the table name instead of interpolating it by hand.
    cursor.execute(sql, (tbl,))
    for row in cursor.fetchall():
        constraint, table, column, foreign_table, foreign_column = row
        print '"%s" -> "%s" [label="%s"];' % (tbl, foreign_table, constraint)

def get_tables(cursor):
    cursor.execute("SELECT tablename FROM pg_tables WHERE schemaname='public'")
    for row in cursor.fetchall():
        yield row[0]

def main():
    parser = OptionParser()

    group = OptionGroup(parser, "Database Options")
    group.add_option("--dbname", action="store", dest="dbname",
            help="The database name.")
    group.add_option("--dbhost", action="store", dest="dbhost",
            default="localhost",  help="The database host.")
    group.add_option("--dbuser", action="store", dest="dbuser",
            help="The database username.")
    group.add_option("--dbpass", action="store", dest="dbpass",
            help="The database password.")
    parser.add_option_group(group)

    (options, args) = parser.parse_args()

    if not options.dbname:
        print "Please supply a database name, see --help for more info."
        sys.exit(1)

    try:
        conn = psycopg2.connect("dbname='%s' user='%s' host='%s' password='%s'"
            % (options.dbname, options.dbuser, options.dbhost, options.dbpass))
    except psycopg2.OperationalError, e:
        print "Failed to connect to database,",
        print "perhaps you need to supply auth details:\n %s" % str(e)
        print "Use --help for more info."
        sys.exit(1)

    cursor = conn.cursor()

    print "Digraph F {\n"
    print 'ranksep=1.0; size="18.5, 15.5"; rankdir=LR;'
    for i in get_tables(cursor):
        writedeps(cursor, i)
    print "}"

    sys.exit(0)

if __name__ == "__main__":
    main()

You could run it as follows:

python postgres-deps.py --dbname some_database | dot -Tpng > deps.png

Note: for other options use:

python postgres-deps.py --help
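If you are curious what actually gets piped into dot, this little sketch builds the same kind of output from a hard-coded list instead of a live database. The table and constraint names are hypothetical, and to_dot is a helper made up for this example, not part of the script above.

```python
def to_dot(edges):
    """Build DOT text from (table, foreign_table, constraint) tuples."""
    lines = ["digraph F {", 'ranksep=1.0; size="18.5, 15.5"; rankdir=LR;']
    for table, foreign_table, constraint in edges:
        lines.append('"%s" -> "%s" [label="%s"];'
                     % (table, foreign_table, constraint))
    lines.append("}")
    return "\n".join(lines)

# Hypothetical schema: orders references customers,
# order_items references orders.
edges = [
    ("orders", "customers", "orders_customer_id_fkey"),
    ("order_items", "orders", "order_items_order_id_fkey"),
]
print(to_dot(edges))
```

Each foreign key becomes one labelled edge from the referencing table to the table it depends on.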

That should spit out one of these:

[Image: deps.png, the generated dependency graph]

Getting Git manpages on OS X April 15th, 2010

For some reason the OS X install of Git doesn’t include the manpages. Here is how I installed them.

First off, find the appropriate manpath.

greg@codemine:~ %> cat /etc/manpaths
/usr/share/man
/usr/local/share/man

/usr/local/share/man looks good…

greg@codemine:~ %> VER=`git --version | awk '{print $3}'`
greg@codemine:~ %> curl -O http://www.kernel.org/pub/software/scm/git/git-manpages-$VER.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  242k  100  242k    0     0  92051      0  0:00:02  0:00:02 --:--:--   99k
greg@codemine:~ %> sudo tar xjv -C /usr/local/share/man -f git-manpages-$VER.tar.bz2
Password:
x ./
x ./man1/
x ./man1/git-add.1
[snip]
x ./man7/gitworkflows.7
greg@codemine:~ %> rm git-manpages-$VER.tar.bz2
greg@codemine:~ %>

“man git-add” should now work fine.

Extending Python with modules written in C December 26th, 2009

Using C (or C++) to create Python modules is really quite simple, provided you know a little C of course. I recently had to do some work around getting a bunch of legacy C code talking to a newer system and thought I’d post a nice simple example of how the Python extensions work.

This code gives you a single function “do()” that runs a command passed to it as a string, lets the command print its output to stdout, and returns the status value from system() (which encodes the exit code) as a Python int.

Dump this into “mycmd.c”:

#include <Python.h>

static PyObject * mycmd_do(PyObject *self, PyObject *args) {
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return Py_BuildValue("i", sts);
}

static PyMethodDef MyCmdMethods[] = {
    {"do", mycmd_do, METH_VARARGS, "Print output of 'cmd', return exit code."},
    {NULL, NULL, 0, NULL}        /* Sentinel */
};

PyMODINIT_FUNC
initmycmd(void) {
    (void) Py_InitModule("mycmd", MyCmdMethods);
}

int main(int argc, char *argv[]) {
    Py_SetProgramName(argv[0]);
    Py_Initialize();
    initmycmd();
    return 0;
}

Great, so we have some example code now. Here is how you build an importable module with it:

greg@codemine:~/code/mycmd %> cc -dynamic -g -Wall -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c mycmd.c -o mycmd.o
greg@codemine:~/code/mycmd %> cc -bundle -undefined dynamic_lookup mycmd.o -o mycmd.so

Note: Don’t forget to replace the include path above with the correct path to Python.h on your machine.

This should give you a mycmd.so on Unix-like systems (the compiler flags above are specific to OS X); on Windows, extension modules use a .pyd suffix rather than .dll. In the same directory, run a Python interpreter and test it out.

greg@codemine:~/code/mycmd %> python
Python 2.6.3 (r263:75183, Nov  4 2009, 12:53:19)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mycmd
>>> mycmd.do('/usr/bin/false')
256
>>> mycmd.do('/usr/bin/true')
0
>>> mycmd.do('uname -a')
Darwin codemine.codelounge.int 10.2.0 Darwin Kernel Version 10.2.0: Tue Nov  3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386 i386
0
>>>

There is much more you can do around this, and thankfully the documentation is remarkably good.

There is not much to the actual code. First, we define the C function that will handle our command, “mycmd_do”. Then we set up an array of the methods we want to expose to Python, “MyCmdMethods”. Finally, we set up an initializer, “initmycmd”, which registers the module; Python calls it when you run “import mycmd”. The “main” function is only needed if you embed the interpreter in a standalone program, in which case it calls “initmycmd” after the Python initializer “Py_Initialize”.
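In case the 256 returned for /usr/bin/false in the session above looks odd: system() hands back the raw wait status, with the real exit code tucked into the higher bits. Here is a small pure-Python sketch of the same behaviour using os.system, which wraps the same C call. The run helper is just an illustrative name, and os.WEXITSTATUS is only available on Unix-like systems.

```python
import os

def run(command):
    """Run a shell command; return (raw wait status, decoded exit code)."""
    status = os.system(command)
    # Only decode the exit code if the process exited normally
    # (as opposed to being killed by a signal).
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return status, code

print(run("exit 1"))
print(run("exit 0"))
```

On OS X and Linux the first call should report a raw status of 256 with an exit code of 1, which matches what mycmd.do('/usr/bin/false') returned.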