The Lone C++ Coder's Blog


The continued diary of an experienced C++ programmer. Thoughts on C++ and other languages I play with, Emacs, functional, non-functional and sometimes non-functioning programming.

Timo Geusch


Good programmers are supposed to be lazy, right? The way I interpret this statement - because none of the software engineers I know could be considered lazy - is that we like to automate repetitive tasks. You know, tasks like checking if you’ve made any changes to your blog and then building the blog and deploying the changes automatically. Which is what I’ve done, and in this post I’ll show you my minimalist setup for doing so.

In the spirit of keeping everything off other people’s platforms, the blog content is managed in a Mercurial repository on my home server. This repo holds all of the artifacts - blog configuration, Markdown/Org files for the posts, images, robots.txt and so on - everything other than the comments and the Hugo theme.

The comments are stored in a SQLite database on the web server itself that is also backed up to my home server regularly. That’s a simple rsync though and not included in the build script. The build and deployment scripts really only focus on getting the blog onto the web server and not on getting data back onto the home server.
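For the curious, that backup boils down to a one-liner along these lines - the host name and paths here are placeholders for illustration, not the actual setup:

# Hypothetical paths - pull the comments database from the web server to the home server
rsync -az webserver:/var/db/blog/comments.sqlite ~/backups/blog-comments/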

The blog build and deployment process can be broken down into the following steps:

  • Update the deployment checkout from my local Mercurial repository
  • Run Hugo in release build mode to generate the deployment artifacts and apply some minification
  • Adjust some of the category feeds. Hugo puts the category feeds into /categories/<category name>/index.xml, whereas I prefer to have them in /categories/<category name>/feed/. The build script makes those adjustments and makes the files available in both places. There’s also the slight embarrassment that I hadn’t noticed that WordPress uses /category instead of /categories. So much for “making my feeds look like WordPress”. Oops. Based on the NGINX error log, the only category feed actually in use is the Emacs one, which I believe is pulled by Planet Emacslife. So for now the quickest way out of this dilemma was to duplicate the /categories/emacs/ feed into /category/emacs/feed and adjust the link in the index.xml. sed is clearly the correct tool to edit XML, right? Hey, it works on my machine.
  • Pre-compress the generated artifacts of certain types (HTML, CSS, JS, JSON). The NGINX server that hosts the site is configured with gzip_static on;, so for the document types it is supposed to compress, it prioritises serving pre-compressed .gz files over on-the-fly compression. That’s one way to keep the load on the server down.
  • Sync the artifacts to the production server. I use Unison for this task as it’s really good at syncing the minimal amount of data necessary. It’s maybe a little overkill in this scenario as I could’ve used a simple rsync. The deployment is really only going one way to the server, so I can’t take advantage of the great two-way sync that Unison offers. Ah well, it was installed and set up on the server, so why not use it.

The process is broken down into two scripts: one that builds the artifacts and applies the necessary tweaks like compression and generating the additional feed directories, and a deployment script that triggers the build and pushes the updates to the server if the build succeeded.

Build script

Let’s look at the build script first:

#!/bin/sh

# -n keeps gzip from embedding the original file name and timestamp,
# so unchanged files compress to byte-identical .gz files on every build.
GZIP="/usr/bin/gzip -n"

rm -rf public/*
hugo --minify --theme="hugo-future-imperfect-slim"

if [ "$?" -eq "0" ]; then
    cd public || exit 1

    # Main feed: also make it available as /feed/index.xml and /feed.xml.
    mkdir feed
    cp index.xml feed/index.xml
    cp index.xml feed.xml

    # Category feeds: duplicate each /categories/<name>/index.xml
    # into /categories/<name>/feed/index.xml.
    cd categories
    for directory in $(find . -mindepth 1 -maxdepth 1 -type d);
    do
        cd "$directory"
        mkdir feed
        cp index.xml feed/index.xml
        cd ..
    done
    cd ..

    # WordPress-style Emacs feed at /category/emacs/feed/, with the links rewritten.
    mkdir -p category/emacs/feed
    sed 's/categories\/emacs/category\/emacs\/feed/g' categories/emacs/index.xml > category/emacs/feed/index.xml

    # Pre-compress the text artifacts so NGINX can serve them via gzip_static.
    for html_file in $(find . \( -name "*.html" -o -name "*.xml" -o -name "*.css" -o -name "*.js" \));
    do
        $GZIP -k "$html_file"
    done
    $GZIP -k index.json
    exit 0
else
    exit 1
fi

Overall the script is pretty straightforward. The main subtlety is the use of gzip -n: without it, gzip embeds the original file name and timestamp in the compressed file, which means every generated .gz file has to be synced because it differs from its previously generated incarnation, even if the contents have not changed. Using -n instructs gzip not to embed this information, which ensures that we don’t have to sync files that are otherwise identical to the version on the server. I also need gzip’s -k parameter to make sure the resulting artifacts include both the compressed and uncompressed versions of each file. Without -k, gzip compresses a file and then deletes the original, which is not what we want.
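A quick way to see the effect of -n with a throwaway file (the file names here are just for illustration):

printf 'hello\n' > sample.txt
gzip -c sample.txt > first.gz
sleep 1
touch sample.txt                # same content, newer timestamp - like a fresh Hugo build
gzip -c sample.txt > second.gz
cmp first.gz second.gz          # differ, because the input's timestamp is embedded

gzip -nc sample.txt > third.gz
sleep 1
touch sample.txt
gzip -nc sample.txt > fourth.gz
cmp third.gz fourth.gz          # identical - with -n only the content matters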

NGINX is also configured to support Brotli compression. It’s currently applying Brotli on the fly if a client requests it, but I’m planning to also generate pre-compressed files for Brotli compression.
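That future step would probably just mirror the gzip loop in the build script. Here’s a sketch, assuming the brotli command-line tool is installed (NGINX would then also need the ngx_brotli module’s brotli_static directive enabled):

# Sketch only: generate .br files next to the existing .gz ones.
# -q 11 is the highest quality setting; -o writes to a separate output
# file and leaves the input untouched.
for file in $(find . \( -name "*.html" -o -name "*.xml" -o -name "*.css" -o -name "*.js" -o -name "*.json" \));
do
    brotli -f -q 11 -o "$file.br" "$file"
done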

Deployment script

The deployment script resides outside the Mercurial repository at the moment and is very simple - it calls the script above and if the build script exits successfully with a 0 exit code, it kicks off the sync process:

#!/bin/sh

# Update the deployment checkout, rebuild the site and only sync it to the
# web server if the build succeeded.
cd ~/blog-build
hg pull -u
./build-prod-site.sh
if [ "$?" -eq "0" ]; then
    unison -auto -batch public ssh://<web-server>/<deployment-directory>
else
    echo "Build error, skipping sync"
fi

Not much to the deployment script. Obviously I need to have appropriate ssh authentication set up so the process can run unattended, as the deployment script is executed once a day from cron. I can - and do, depending on when I publish a post - also run the deployment script manually. One check I am thinking of implementing is to see whether Mercurial actually updated any files and only trigger the build script if it did. In my opinion that’s a nice-to-have though, as the scripts work perfectly fine as they are and do what they’re supposed to do.
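A minimal sketch of what that check could look like - comparing the working copy revision before and after the pull and bailing out if nothing changed. This is just the idea, not what currently runs:

#!/bin/sh

# Sketch: only rebuild and sync when the pull actually brought in changes.
cd ~/blog-build

before=$(hg identify --id)
hg pull -u
after=$(hg identify --id)

if [ "$before" = "$after" ]; then
    echo "No new changesets, skipping build and sync"
    exit 0
fi

./build-prod-site.sh && unison -auto -batch public ssh://<web-server>/<deployment-directory>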

So all in all, two pretty simple scripts that make life easier simply because it’s one less thing I need to think about.
