As you may know, we’re building a platform for the delivery of city government services, available for public feedback right now at alpha.phila.gov. We’re starting small but thinking big, so we’re paying some attention now to devops and our overall deploy process.
In planning for the development and deployment of alpha.phila.gov I set a couple of requirements:
- Automatic deployment via push to GitHub.
- Infrastructure as code, also committed to GitHub.
These give us continuous delivery on a stack that can be easily modified and reproduced across production, staging, and testing environments.
The city’s Office of Innovation and Technology (OIT) has been steadily moving services to Amazon Web Services (AWS), so I explored our options there. AWS is more flexible during development than running machines internally, and it provides a clearer path for scaling up as we grow into production.
At this point in the evolution of AWS, the users of Amazon’s services suffer from a paradox of choice. Among the deployment management options, we evaluated Elastic Beanstalk, CodeDeploy, OpsWorks, and rolling our own with straight EC2. For the reasons stated below, we eventually decided on OpsWorks, though we used CodeDeploy for a month while I set up our cookbooks.
While I know my way around a Linux box, I had never written a Chef recipe before, so it was tempting to stick with CodeDeploy and simply script the server setup with Bash. With OpsWorks, however, we have a more conventional way to set up configuration as code that can be committed, reviewed, and rolled back when necessary (see those recipes at github.com/CityOfPhiladelphia/phila.gov-cookbooks).
The UI for OpsWorks is also a cut above the rest of AWS, with clean overview layouts that let you understand the whole stack at a glance. It really is a great feeling to add an instance to a layer in the stack and know that within a few minutes a new machine will be configured behind the load balancer, connected to the database, accepting requests and returning responses.
The local development environment
To set up a local development environment, all one has to do is clone the repo at github.com/CityOfPhiladelphia/phila.gov and run “vagrant up” in that directory. That process relies on Vagrant, a tool for building reproducible virtual-machine development environments.
Our Vagrant setup uses a simple Bash script to approximate our production setup. I would like to use the Chef cookbooks for Vagrant as well, but the AWS-specific parts (relying on deploy variables and connecting to the database at RDS) have delayed that across-the-board consistency. In addition, enough of our team has issues with Vagrant and VirtualBox on their machines (Windows and old MacBooks) that I’ve been investigating another route: giving each developer EC2 instances in their own OpsWorks testing stack. This involves a slightly more complicated development setup (maybe using sshfs to edit those remote files), but gives us parallel production, staging, and testing environments that are easy to keep in sync.
WordPress and Composer
We have separate repos at GitHub for our custom WordPress theme and plugin for alpha.phila.gov. In development, those repos are checked out within the main phila.gov repo and commits are pushed from there. When a developer on our team (usually the awesome Karissa Demi) is ready to test new code on staging, the following steps are taken:
- Push a new release for the theme or plugin
- Switch to a “clean” checkout of the phila.gov repo
- Run “composer update” to pull in the latest versions
- Commit the new composer.lock file into the staging branch on the phila.gov repo
The third and fourth steps above are made possible by Composer, a version-locking dependency manager for PHP. With Composer we can define dependencies in the repo, enabling us to manage our WordPress stack with committed code.
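Because composer.lock records the exact version of every package, reviewing the commit from the fourth step amounts to comparing two lock files. The following is a minimal sketch of that comparison, not part of our tooling: the function names are hypothetical, and it assumes only the standard lock file layout, in which "packages" and "packages-dev" hold entries with "name" and "version" keys.

```python
import json


def locked_versions(lock_text):
    """Map each package name in a composer.lock document to its pinned version."""
    lock = json.loads(lock_text)
    # Runtime dependencies live under "packages"; development-only
    # dependencies live under "packages-dev". Either may be absent.
    entries = (lock.get("packages") or []) + (lock.get("packages-dev") or [])
    return {entry["name"]: entry["version"] for entry in entries}


def changed_packages(old_lock_text, new_lock_text):
    """Return {name: (old_version, new_version)} for packages that differ.

    A version of None means the package is missing from that lock file,
    i.e. it was added or removed by the update.
    """
    old = locked_versions(old_lock_text)
    new = locked_versions(new_lock_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes
```

Feeding it the lock file from before and after “composer update” gives a short list of what the staging deploy will actually change.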
Getting WordPress to work with Composer involves modifications to wp-config.php, such as overriding WP_CONTENT_DIR. A number of other configuration values are set via environment variables, which lets us deploy straight from this public repository without committing environment-specific values. The update command also pulls in the latest releases of all of our third-party plugins, and can even update WordPress itself.
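The environment-variable pattern is independent of language. Here is a minimal sketch of it in Python rather than the site’s PHP. The variable names are assumptions that borrow WordPress’s constant names, and MissingSetting and load_settings are hypothetical helpers, not code from our repo.

```python
import os


class MissingSetting(Exception):
    """Raised when a required environment variable is not set."""


# Assumed names for illustration; the real deploy variables may differ.
REQUIRED = ("DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST")


def load_settings(environ=os.environ):
    """Read configuration from the environment, failing fast on gaps."""
    missing = [key for key in REQUIRED if not environ.get(key)]
    if missing:
        raise MissingSetting("missing environment variables: " + ", ".join(missing))
    settings = {key: environ[key] for key in REQUIRED}
    # Optional value with a safe default when the variable is unset.
    settings["WP_DEBUG"] = environ.get("WP_DEBUG", "false").lower() == "true"
    return settings
```

Nothing secret lives in the repository this way, and each stack supplies its own values at deploy time.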
Once that composer.lock file has been committed, a push to the staging branch on GitHub will make any changes live on our staging stack at OpsWorks. This is managed by Travis CI, which sees any push to GitHub and tells OpsWorks to run a deployment. See our .travis.yml for configuration details.
Once the team has reviewed staging and decided the changes there should land in production, we create a pull request at GitHub between the staging and master branches. After that merge is complete, the same process involving Travis CI and OpsWorks handles the production deployment.
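The branch-to-stack routing can be sketched as a small lookup that builds the parameters for an OpsWorks deployment. This is an illustration under stated assumptions, not necessarily how our .travis.yml does it. The stack and app identifiers are placeholders and deployment_request is a hypothetical helper. The keys follow the OpsWorks CreateDeployment API, so the resulting dictionary could be passed to create_deployment on a boto3 OpsWorks client.

```python
# Placeholder identifiers; real OpsWorks stack and app IDs are UUIDs.
STACKS = {
    "staging": {"StackId": "STAGING-STACK-ID", "AppId": "STAGING-APP-ID"},
    "master": {"StackId": "PRODUCTION-STACK-ID", "AppId": "PRODUCTION-APP-ID"},
}


def deployment_request(branch, commit, stacks=STACKS):
    """Build CreateDeployment parameters for a pushed branch.

    Returns None for branches that are not tied to a stack, so pushes
    to feature branches never trigger a deployment.
    """
    target = stacks.get(branch)
    if target is None:
        return None
    return {
        "StackId": target["StackId"],
        "AppId": target["AppId"],
        # "deploy" is the OpsWorks command that runs an app deployment.
        "Command": {"Name": "deploy"},
        "Comment": "Deploy %s from branch %s" % (commit, branch),
    }
```

Keeping the mapping in one place makes it clear that the staging and master branches are the only ones that reach a live stack.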
I’ve been pleasantly surprised by OpsWorks. Little details make all the difference. For example, when I ssh into a testing box to land a hotfix or connect to the DB, the MOTD includes all of the machine information, so I know I’m in the right place before I start hacking away. Another nicety is that it’s easy to copy stacks in OpsWorks, which I’ve done to create our parallel environments. Also, while it took me some time to climb the Chef learning curve, our cookbooks ended up containing fewer than a hundred lines of code. That brevity rests on two assumptions:
- The cookbooks rely on the built-in recipes at OpsWorks to do much of the basic machine setup, such as user accounts and database connections.
- All of our machines run Ubuntu 14.04, so package names and configuration file locations are specific to Ubuntu.
There are a few caveats, however. Those new instances that are so easy to spin up without any manual intervention take an absurdly long time to do so. Deploys are also slow due to all of the back and forth between GitHub, Travis CI, and OpsWorks. Finally, I fear that even though it relies on industry standards like Chef, OpsWorks encourages vendor lock-in because we get used to its way of doing things. That’s what a good hosted product should do, though.
We have yet to test the assumed scalability of this setup, and we’re just starting to build out our resource monitoring, but we’ve already reaped some benefits. One example is the recent launch of alpha.phila.gov/property. A small change to the nginx configuration in our cookbook set up the proxy for GitHub Pages. That change was deployed to staging and reviewed there. Once we were fairly confident it worked, we updated the cookbooks in production, and the service went live without a hitch.