Neurons are firing again! How I eat my own dogfood with my blog.
First, huge shout out to the five of you that actually read stuff from my WordPress blog!
You might have noticed that, with the exception of a few posts for my company, I’ve been mostly silent. I haven’t given up on writing; quite the contrary, in fact! I haven’t run out of neurons to fire either (darn!). I’ve stopped writing on my own site for one reason: I pledged to not write again until I was completely out of WordPress. Now that I am, I can write again!
Why do you hate WordPress so much, Carlos?
Because principles.
I’ve been using WordPress for writing blogs since I decided to start writing back in 2015. (I actually started using it in 2013, but I didn’t really commit to writing on a semi-regular basis until two years later.) It’s always been a love-hate relationship, mostly erring on the side of hate, for a few reasons:
- The editor is clunky and takes forever to load over slow connections. (This was a massive deal in the dark ages when American and Delta were using cellular-based internet in their planes and breaking past dial-up speeds was a huge surprise.)
- The code generated by their editors is all sorts of non-standard.
- I enjoy cross-posting my posts onto LinkedIn Pulse. It was never a copy-paste operation because of the non-standardness mentioned above. It often took about an hour to get my WordPress and LinkedIn blogs in sync, and even then small things weren’t perfect.
These were annoying, but not dealbreakers. The dealbreaker that finally pushed me over the edge was this: I had to pay WordPress $13/month to use a custom domain backed by HTTPS.
$13/month.
“That’s really not a lot, Carlos.” You’re right. But think about it:
- I’m paying them $13/month for what is probably a five-second change on their webserver.
- I’m paying them $13/month for the (possible) privilege of using a wildcard SSL certificate, so it’s not like I’m even telling readers that I own that blog.
- I’m paying them $13/month because I was too lazy to roll my own solution, despite being perfectly capable of doing so.
It’s not the money here; it’s the principle. I’m an Engineer. Why the hell am I using a chunky, JavaScript-heavy CMS to host my musings when I can use something lighter?
The Goal
This was what I wanted my blog to be:
- Powered by Git: Every post to be version-controlled in GitHub.
- Markdown everywhere: Every post is pure, high-grade, unadulterated Markdown.
- Light and fast: The blog would be super fast anywhere and extremely light on JavaScript.
- No ads: The blog would have no ads, anywhere, ever.
- Locally testable: The blog can be tested locally and will look and behave the same on the web.
- Easy to maintain: Adding anything to the blog, including SEO for discoverability, would be easy.
- CHEAP AF: The blog would be stupid cheap to run.
- REPRODUCIBLE AF: The blog would be stupid easy to reproduce anywhere, any time.
- CI/CD for everything: The blog would be deployed via CI whenever I post new content.
Essentially, I wanted a fast, super simple blog driven by CI and Git Flow.
Hello, Hugo
I knew that Git-friendly blogs like Ghost and Jekyll existed, but I didn’t really know the landscape well. So I went looking and stumbled upon Hugo within about 32 seconds.
Hugo is a Golang-powered static website generator with a focus on blogs. While you can create just about any kind of website you put your heart into with it, it is really good at creating blogs. Its basic premise is threefold:
- Almost everything is composed of a hierarchy of HTML files.
- Every post has “front matter” that describes it.
- Everything is configurable through a configuration file (`config.toml` in my case).
Websites powered by Hugo can be hosted by Hugo through a small Go web server or statically generated and copied onto the web server of your choice.
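To make this concrete, here’s roughly what a post looks like and how you’d preview it (a minimal sketch; the file name, dates, and tags are made up):

```bash
# Hypothetical example: write a post with TOML front matter, then preview it.
# The front matter between the "+++" markers describes the post; everything
# after it is plain Markdown.
cat > content/post/hello-hugo.md <<'EOF'
+++
title = "Hello, Hugo"
date = "2018-03-30"
draft = false
tags = ["meta", "hugo"]
+++

My first post, written in pure, unadulterated Markdown.
EOF

hugo server --buildDrafts   # preview with Hugo's built-in web server at http://localhost:1313
hugo                        # render the static site into ./public for any web server
```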
Perfect for a WordPress hater like me.
Serverless all the things!
So my natural inclination was “let’s do serverless!”
My thoughts were:
- I don’t get a ton of traffic, so why not use Lambda or Azure Functions to render the site on demand?
- I get to finally learn serverless and sound cool at talks.
But I didn’t have to dig very deep into this hole to see that it didn’t make any sense. While I did find a pretty awesome guide on how to do this, it was way more complicated than I originally intended.
Additionally, Hugo generates static websites. Just-in-time rendering didn’t make sense when a simple local render and S3 sync would do. While serverless will probably be much more useful for automated chores (like purging old `index.html` files, since they are versioned; more on that later), using it for core blog workloads didn’t make sense.
How about S3?
Using AWS S3 and S3 Static Web Hosting made much more sense. This would let me build the blog locally, test it with Hugo and sync the files up to a designated bucket (a rough sketch of that flow appears after the list below). The only negatives with this approach are:
- Websites hosted out of S3 do not support custom SSL certificates. The only way to work around this is by using a CloudFront distribution and making the S3 bucket an origin. While this does satisfy the “make it fast everywhere” goal, it complicates my infrastructure a slight bit, and it slightly increases cost.
- It isn’t possible to do “clean” integration tests, since the Hugo web server makes different assumptions than S3’s web server. Now that I think about it, though, I’m not sure clean integration tests were ever possible, since we can’t bring up an S3 web server locally.
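Other than that, the happy path is refreshingly boring. Here’s a rough sketch of the build-and-sync flow (the bucket name is a placeholder):

```bash
# Hypothetical sketch: render the site locally, then mirror the generated
# files into the S3 bucket behind the CloudFront distribution.
set -euo pipefail

BUCKET="s3://my-blog-bucket"   # placeholder name

hugo                                      # render static files into ./public
aws s3 sync ./public "$BUCKET" --delete   # upload new/changed files, delete removed ones
```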
Automating all of this!
Terraform, Docker and Make, of course!
Terraform all the things!
I could have used AWS CloudFormation to provision all of this, but I wanted to use something that would make it easy for me to host blogs on multiple clouds. While I’m using AWS for this now, I intend on getting an Azure certification this year, and hosting this blog on it will be an easy way for me to study. I’ve also been using Terraform for many, many years, and absolutely love it. There is no easier way of provisioning infrastructure (though Pulumi is looking very interesting), and it supports just about every cloud provider out there.
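Since Terraform itself runs from a Docker container in my setup (more on that below), the plan/apply loop looks roughly like this (a sketch, not `blog-gen`’s actual Make targets; the image tag and directory layout are assumptions):

```bash
# Hypothetical sketch: run Terraform from the official Docker image against a
# provider-specific directory (e.g. infrastructure/aws).
set -euo pipefail

TERRAFORM_IMAGE="hashicorp/terraform:0.11.14"   # assumed tag
TF_DIR="infrastructure/aws"

terraform() {
  docker run --rm \
    --env-file .env \
    --volume "$PWD:/work" \
    --workdir "/work/$TF_DIR" \
    "$TERRAFORM_IMAGE" "$@"
}

terraform init                    # download providers and set up state
terraform plan -out=plan.tfplan   # preview the changes
terraform apply plan.tfplan       # apply exactly what was planned
```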
Docker all the things, too!
I could have created provisioning scripts that install Hugo and other tools required to provision my blog, but why do that when I can use Docker instead? Containers make it really easy to deploy and run applications in consistent environments anywhere. Docker Compose makes it easy to link multiple containers with each other and run containerized tasks repeatedly. Combining the two made provisioning super easy!
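As an example, a Compose file along these lines turns “preview my blog” into a one-liner (a sketch only; the Hugo image, port, and volume wiring are assumptions, not `blog-gen`’s actual file):

```bash
# Hypothetical sketch: define a Compose service for local previews.
# "start-local-blog" matches the service name used later in this post.
cat > docker-compose.yml <<'EOF'
version: "2"
services:
  start-local-blog:
    image: klakegg/hugo    # assumed image; any containerized Hugo works
    command: server --bind 0.0.0.0 --port 8080
    env_file: .env
    working_dir: /src
    volumes:
      - .:/src
    ports:
      - "8080:8080"
EOF

docker-compose up start-local-blog   # then browse to http://localhost:8080
```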
Docker in Docker, though.
The biggest challenge with this was dealing with the Docker in Docker problem. While nested containers aren’t bad per se, accessing networked services, like my local Hugo server, or mounting volumes on them basically required host-mode networking and passing paths on the host downstream. This can introduce security vulnerabilities, as it allows containers (with root access) to bind processes onto real ports on the host machine.
Additionally, starting Docker Compose or Docker from within nested containers requires that the `docker` and `docker-compose` binaries be present in the container’s `$PATH`.
I develop on a Mac, and OS X doesn’t allow you to simply mount these binaries from your computer, since the Docker for Mac daemon treats the directories they live in as “special” directories that can’t be volume-mounted. To work around this, I have to install Docker and Docker Compose in some containers when they start up.
Not ideal, but it works.
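The workaround boils down to something like this at container startup (a hypothetical entrypoint sketch, not the actual script):

```bash
#!/usr/bin/env bash
# Hypothetical entrypoint: install the Docker CLI and Docker Compose at startup
# if they aren't already on the $PATH, then hand off to the real command.
set -euo pipefail

if ! command -v docker > /dev/null
then
  curl -fsSL https://get.docker.com | sh
fi

if ! command -v docker-compose > /dev/null
then
  pip install docker-compose
fi

exec "$@"
```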
Make all the things, too!
Choosing a build runner for this code was slightly more complicated. I generally default to writing Bash scripts first, as I know Bash well (despite it being a poor language compared to other scripting languages like Python or Ruby), I can count on it being available on just about anything, and linting and testing are easy to do with ShellCheck and BATS, respectively.
However, I like using Make since it is slightly more portable and somewhat friendlier for building stuff that doesn’t produce explicit build artifacts. I’m guilty of writing really, really complicated `Makefile`s, though, so I wanted to be really careful about making my Make structure easily readable and approachable this time.
Handling secrets
Committing sensitive cloud provider API keys would be pretty bad. However, I didn’t want to set up a Vault cluster and worry about that “chicken-and-egg” problem. So I resorted to a simpler approach: environment dotfiles in AWS S3 and an example dotfile committed to Git.
While dotfiles have the “but you can store them insecurely!” problem, they are easy to reason about and easy to import into Make and other tools.
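In practice, that looks something like this (a sketch; the bucket variable mirrors the `DOTENV_S3_BUCKET` setting described later, and the exact object key is an assumption):

```bash
# Hypothetical sketch: use the local .env if present; otherwise pull the
# "production" dotfile from S3.
set -euo pipefail

if ! [ -f .env ]
then
  aws s3 cp "s3://${DOTENV_S3_BUCKET}/.env" .env || {
    echo "ERROR: no local .env and unable to fetch one from S3" >&2
    exit 1
  }
fi

# Export everything in the dotfile into the current environment.
set -o allexport
source .env
set +o allexport
```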
CI/CD
I decided on using Travis CI for CI/CD. I’ve been using it for many years and have generally been happy with it. I like being able to simply drop a `.travis.yml` into my repository describing my build, test and deploy stages. It is way easier to use and way harder to abuse than the `Jenkinsfile`s that I’ve seen and written in the past for Jenkins. Their CLI tool is also great, albeit buggy, and allows me to monitor and invoke builds without having to use the website.
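For the curious, the handful of commands I actually lean on looks like this (the repository slug is a placeholder):

```bash
# The Travis CLI ships as a Ruby gem.
gem install travis

travis login                              # authenticate the CLI
travis history -r your-user/your-blog     # list recent builds for the repository
travis restart -r your-user/your-blog     # kick off the most recent build again
travis logs -r your-user/your-blog        # stream logs from the most recent job
```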
The final product!
After many weeks of iteration and failure, `blog-gen` was born! Here’s how it works! It’s not the friendliest process, but it’s a start.
- Create a Git repository containing a Hugo blog with the standard Hugo directory hierarchy and files.
- Add a `config.toml.tmpl` file. This is a gomplate-templated version of Hugo’s `config.toml`. This allows for more flexibility, such as using environment variables to define site properties.
- Clone `blog-gen` into your blog’s Git repository. Ensure that you add the directory to your `.gitignore` to avoid committing a reference to it. (Git will not commit sub-repositories by default unless they are added as submodules.)
- Copy `.env.test` from your `blog-gen` repository into your blog’s repository. Fill in the dummy values and save it as `.env`.
- Create a Docker Compose file that looks something like this inside of your blog’s repository.
- Create a `.travis.yml` that looks something like this inside of your blog’s repository. Make sure that you activate the project within Travis first and define these environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, and `DOTENV_S3_BUCKET` (the S3 bucket containing your “production” `.env`).
- Run `docker-compose up start-local-blog` to start a local instance of your blog. Visit `http://localhost:8080` to see it in your browser. (The port can be changed from within your `.env.test` file.)
- If you like how it looks, commit and push your changes. Travis should run the build, create the S3 bucket and CloudFront distribution, and deploy your blog to the S3 bucket. Provisioning everything for the first time takes about 20 minutes because CloudFront is slow.
Underneath the hood
There are a few moving parts within `blog-gen`. Here’s how they work:

- It looks for a `.env` file with environment variables describing the Hugo website to be provisioned, the provider the blog is being deployed to, and the location to store sensitive data such as Terraform state and passwords.
- If the `.env` isn’t present locally, it tries to fetch it from cloud storage (S3 at the moment) defined by an `.env_info` file. `blog-gen` exits if this fails.
- The `.env` file contains an `INFRASTRUCTURE_PROVIDER` environment variable. Terraform code within `blog-gen` is broken up by cloud provider and stored within the `infrastructure` directory. This environment variable selects which of those provider directories to use.
- `make deploy_infrastructure` initializes the Terraform code found above, generates a Terraform plan and `apply`s it. Terraform is executed in a Docker container, and the version of the image selected is defined within an environment variable. The intermediate Make steps that do all of this work are defined in Makefile includes within the `include/make/terraform` folder.
- `make deploy_blog` runs next. This uses Docker Compose via intermediate Make build steps within `include/make/hugo` and `include/compose/hugo` to build the Hugo blog and deploy the files it generates to S3 (or a different cloud provider, per `$INFRASTRUCTURE_PROVIDER`).

  Hugo generates an `index.html` file with a summary of blog posts. Because it can take some time for CloudFront and local browsers to pick up changes to this file due to caching, we have to invalidate it whenever it changes. I could have used CloudFront cache invalidations to do this, but I only get 1,000 of those per month, and invalidations can take time to complete.

  A much simpler and more instant solution is versioning `index.html` by appending the commit SHA to the end of the file name. When clients request the file by going to https://blog.carlosnunez.me, the distribution won’t have the file immediately available and will fetch it from S3 directly. This introduces some latency for fresh new posts, but that latency goes away after the distribution has been updated (about 20 minutes).

- Travis checks for liveness after the blog has been deployed by trying to fetch it with `curl` for a given amount of time (three minutes by default); a rough sketch of that check is below.
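That check is conceptually similar to this (a minimal sketch, not the exact script; the URL and polling interval are assumptions):

```bash
# Hypothetical liveness check: poll the blog until it returns a successful
# response or the deadline (three minutes by default) passes.
set -euo pipefail

BLOG_URL="${BLOG_URL:-https://blog.carlosnunez.me}"
DEADLINE_SECONDS="${DEADLINE_SECONDS:-180}"

start=$(date +%s)
until curl --silent --fail --output /dev/null "$BLOG_URL"
do
  if (( $(date +%s) - start >= DEADLINE_SECONDS ))
  then
    echo "ERROR: $BLOG_URL did not come up within $DEADLINE_SECONDS seconds" >&2
    exit 1
  fi
  sleep 5
done
echo "INFO: $BLOG_URL is live"
```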
Mission Accomplished!
Due to the pipeline above, this blog post is:
- Written entirely in Markdown,
- Vetted locally with a local Hugo server,
- Stored in Git,
- Deployed and tested via CI, and
- Globally and securely available via CloudFront with my own HTTPS certificate.
On top of all of that, it takes about two minutes to release new blog posts, and new content is immediately available everywhere after that.
I’m pretty happy with this dogfood!
The Future
`blog-gen` is in its very early days. I don’t have a lot of confidence in it working for anyone but me right now. (I do want people to break it, though!) There are a lot of things I want to improve on!
Blue-Green Deployment
I originally wanted to use blue-green deployment so that I could always have a “beta” version of my blog available. This would have been a good place to test reactions to content without it getting indexed on Google, for example. I didn’t know how to go about it when I was developing `blog-gen`, so I decided to punt on it and do rolling releases.
Rolling back is manual at the moment, which I don’t like. Ideally, I would run a job in Travis that:
- Gets the SHA for my last commit, and
- Runs `make deploy_infrastructure` using that commit SHA (since it handles generating the correct `index` and `error.html` files and updating the S3 bucket in kind).
But that’s a manual job right now, and that makes me sad.
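Something like this could eventually do it (a hypothetical sketch; `blog-gen` doesn’t have this job yet, and the “previous commit” assumption is mine):

```bash
# Hypothetical rollback sketch: redeploy the blog from the previous commit.
set -euo pipefail

ROLLBACK_SHA=$(git rev-parse HEAD~1)   # the commit to roll back to

git checkout "$ROLLBACK_SHA"
make deploy_infrastructure             # regenerates index/error.html and updates the bucket in kind
git checkout -                         # return to the branch we were on
```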
Serverless Chores
As you’d imagine, I have a ton of `index.html` files lying around. I don’t need them all, and regenerating them is really easy (as described above). It would be nice to purge them every so often.
Serverless wasn’t perfect for my blog, but it’s perfect for this! I would love a few Golang-written Lambda functions that take care of this!
Private Content
While I don’t plan on doing this right now, I would eventually like to add restricted content. Reviewing client-sensitive content for Contino without releasing client-confidential information would be a good example of this.
Lambda@Edge is a great tool for this purpose. It allows CloudFront maintainers to run Lambda functions against the distribution instead of at the S3 bucket level. This can help when dynamically re-writing addresses or doing more complex routing than what CloudFront offers out of the box.
I would like to play with this.
More Cloud!
I currently only support AWS. I want to support Azure and GCP, largely so that I can finally smack talk them with authority. I’ve designed my blog to be multi-cloud; I just need to implement it!
TL;DR
Guy gets tired of WordPress and builds his own thing with Hugo. It does the DevOps. Guy is happy.