Automating pip and virtualenv with shell scripts

pip and virtualenv are extremely useful tools for managing the dependencies of a Python program. They tend to be used less often than they should be, though, because of the cognitive overhead of remembering to use them properly every time you run or deploy the program. In this post I'll present a set of scripts I use to make this process as easy and automatic as possible.

Here's a sample directory structure I'll be referring to. If you choose a different structure, you'll need to adapt the scripts.

  • bin: script directory
  • ve: location of virtualenv
  • requirements.pip: list of dependencies in pip format

Here's the first automation script, bin/setup:

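#!/bin/bash -e

# Project root: the parent of the directory containing this script.
BASEDIR=$(dirname "$0")/..

# Create the virtualenv on first run.
if [ ! -d "$BASEDIR/ve" ]; then
    virtualenv -q "$BASEDIR/ve"
    echo "Virtualenv created."
fi

# Install only if nothing has been installed yet, or if requirements.pip
# has changed since the last successful install.
if [ ! -f "$BASEDIR/ve/updated" ] || [ "$BASEDIR/requirements.pip" -nt "$BASEDIR/ve/updated" ]; then
    # Use the virtualenv's own pip so packages are installed into ve.
    "$BASEDIR/ve/bin/pip" install -r "$BASEDIR/requirements.pip"
    touch "$BASEDIR/ve/updated"
    echo "Requirements installed."
fi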

The script:

  • Checks that the virtualenv exists. If it doesn't, it automatically creates it.
  • Runs pip to install the dependencies from requirements.pip into the virtualenv. For maximum efficiency, each time the installation succeeds, the file ve/updated is touched. On subsequent runs, the script skips installation unless the modification time of requirements.pip is more recent than that of ve/updated.
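
The timestamp check can be tried out in isolation. The following is a minimal sketch that works in a throwaway temporary directory; needs_install is a hypothetical helper introduced here for illustration and is not part of bin/setup. It uses the same test as the setup script: install when the stamp file is missing, or when the requirements file is newer than the stamp.

```shell
#!/bin/bash
# Sketch only: needs_install is a hypothetical helper, not part of bin/setup.

# Succeeds when the stamp file is missing or older than the requirements file.
needs_install() {
    local requirements=$1 stamp=$2
    [ ! -f "$stamp" ] || [ "$requirements" -nt "$stamp" ]
}

DEMO=$(mktemp -d)
REQ="$DEMO/requirements.pip"
STAMP="$DEMO/updated"

# Explicit timestamps keep the demonstration deterministic.
touch -t 202001010000 "$REQ"
needs_install "$REQ" "$STAMP" && echo "no stamp yet: install"

touch -t 202001020000 "$STAMP"      # stamp is now newer than the requirements
needs_install "$REQ" "$STAMP" || echo "stamp is newer: skip"

touch -t 202001030000 "$REQ"        # requirements edited after the last install
needs_install "$REQ" "$STAMP" && echo "requirements changed: install"

rm -rf "$DEMO"
```

Because the stamp is only touched after pip succeeds, a failed installation leaves it stale and the next run tries again.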

The setup script makes managing the environment easier, but for the ultimate automation it can be called at the top of a second script that actually runs the program:

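#!/bin/bash -e

# Project root: the parent of the directory containing this script.
BASEDIR=$(dirname "$0")/..

# Make sure the virtualenv exists and is up to date.
"$BASEDIR/bin/setup"

source "$BASEDIR/ve/bin/activate"
cd "$BASEDIR"
export PYTHONPATH=.

# Quote "$@" so that arguments containing spaces are forwarded intact.
exec python "$@"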

Whenever you'd run the Python interpreter, either on one of your modules or interactively, run this script instead. Arguments are forwarded directly to Python.

The script first runs the setup script to make sure the virtualenv is up to date. Because of the modification time check, the overhead of doing this each time is negligible if the requirements haven't changed. It then activates the virtualenv, eliminating the need to do so yourself in each terminal you open. It sets the working directory to the root of your project, so that relative paths work properly (though in serious projects you really should build paths with os.path functions instead). Finally, it adds the root of your project to the Python path, allowing absolute imports of your project's packages from any module in a subpackage.

This setup is extremely convenient for development; each time you run your program it'll be run with up-to-date dependencies and in the right environment. It's also great for deployment to other machines because of the automation. The deploy process becomes nothing but a checkout and restart of the server process. However, if you have many dependencies, it may take several minutes to install them all, during which time your program isn't really running even though it's been started. In a production environment where this is not acceptable, an easy change is to run the setup script separately as part of the deploy process, so the run script has nothing to do before starting your program.
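
One way to sketch that production change is a switch that tells the run script the deploy process has already taken care of setup. The SKIP_SETUP environment variable and the should_run_setup helper below are hypothetical names introduced for illustration; they are not part of the scripts above, and other mechanisms (such as a separate production run script) would work just as well.

```shell
#!/bin/bash
# Sketch only: SKIP_SETUP and should_run_setup are hypothetical names.

# Succeeds when the run script should still call bin/setup itself,
# that is, unless SKIP_SETUP is set to 1.
should_run_setup() {
    [ "${SKIP_SETUP:-0}" != "1" ]
}

# In the run script, the unconditional call to bin/setup would become:
#
#     if should_run_setup; then
#         "$BASEDIR/bin/setup"
#     fi

# Demonstration of the production case.
SKIP_SETUP=1
if should_run_setup; then
    echo "development: run script calls bin/setup"
else
    echo "production: deploy already ran bin/setup"
fi
```

With this arrangement the deploy process would run bin/setup once after the checkout and then start the server with SKIP_SETUP=1, while development runs leave the variable unset and keep the automatic behaviour.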

I now use this template for every nontrivial Python program I write.