Making Jenkins Great Again

Within the test engineering department here at TrueCar, we’ve tinkered and experimented with CI practices for years.  It started, as it does for most, with a hacked-together Jenkins server.  It served its purpose in giving us a place to run scheduled and ad-hoc test jobs in support of the products we provided coverage for.  Nonetheless, it was a brittle server created by a long-gone admin.  When the Jenkins box had issues, our team was usually left to fix it ourselves (and this happened often).  When tests did run successfully, jobs would usually take 2-5x longer than the same run on a local machine.

It goes without saying that working with Jenkins was something that few dared to do, and others (myself included) did so with great reluctance.  Despite that reluctance, I made the most of what I had and ended up learning Jenkins the hard way.  In fact, without that painful stretch of maintaining the old server, I can safely say that I would not have known what to do when we started from scratch.

In time, we began transitioning our infrastructure to AWS, and I was given a fresh, cloud-hosted Jenkins box dedicated solely to testing.  Drawing on what I had learned and liked from the old box, and applying suggestions from our DevOps team, I began configuring the new one.

The following features and practices of the new Jenkins server are worth calling out:

  • Okta Authentication – Okta is not perfect, but it works very well with Jenkins for auto-login.
  • Roles and Permissions Management – Our old Jenkins box had no restrictions on who could configure what.  On the new box, I am able to configure roles at the job level.
  • Job name regular expression restrictions – By restricting which job names are allowed, I was able to define roles and permissions at the job level.  Jobs that follow a specific naming convention are also easier to process programmatically (a sketch of such a pattern follows this list).
  • AWS Slave auto-provisioning – With our testing strategy, we can’t be limited by a set number of slaves.  If our build queue gets backed up, we need slaves to be auto-provisioned to meet the need.
  • Nightly Backups – The days of making scary changes on Jenkins are over.  The worst-case scenario now is losing a day’s worth of work.  This has already come in handy a number of times for us.
  • Docker Imaging – We have a Jenkins job that runs nightly, which builds a new base image with our code and dependencies prepackaged on it.
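
To illustrate the naming-convention point above, here is a hypothetical sketch (not our actual convention) of a pattern that can both gate job names and drive scripts; JOB_NAME is the variable Jenkins exposes to every build:

    # Hypothetical convention: <team>-<project>-<suite>, e.g. web-listings-smoke.
    # The same pattern can scope a Role Strategy project role to a team's jobs.
    JOB_NAME_PATTERN='^[a-z]+-[a-z0-9]+-(smoke|regression|nightly)$'
    if [[ "$JOB_NAME" =~ $JOB_NAME_PATTERN ]]; then
      echo "Job name follows the convention"
    else
      echo "Job name does not follow the convention" >&2
      exit 1
    fi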

The following plugins were also notably helpful and hence were installed on our new box:

  • Matrix Project Plugin – This is a very important piece of our strategy.  With this plugin, we can parallelize our jobs without fear of collisions.
  • Dynamic Axis – This is a supplemental but important plugin to allow us complete flexibility in parallelizing tests.
  • Environment Injector Plugin – This was critical in passing environment variables across multiple bash environments.
  • Conditional BuildStep – We need this plugin solely to run our post-build steps on the matrix job (rather than on the individual axes).
  • Amazon EC2 plugin – This plugin is necessary if we are to build our architecture around ephemeral slave nodes.
  • Template Project Plugin – This plugin lets us store a configuration in a template job that can be cloned to create other jobs quickly.
  • Build Monitor Plugin – There are many flavors of high-visibility dashboards available, but this is the one we’ve stuck with.
  • Job Configuration History Plugin – This is an administrative must-have.  Even with a subset of administrative hands touching your Jenkins box, mistakes can happen, and this plugin helps us track them down and correct them.
  • Role Strategy Plugin – This plugin allows for permissions management at a micro level.
  • Slack Notification Plugin – Essential for job status notifications.
  • Timestamper – This plugin puts timestamp information on our console output.

I will discuss some of these plugins in more depth in later posts.

To close out this milestone, I want to share a few recommended practices for working with Jenkins.  It would still be possible to achieve the end goal without them, but the difficulty would be orders of magnitude greater.

Use Job Templates

Prerequisites: Template Project Plugin

Utilizing job templates has many benefits.  For one, it lets us standardize the behavior of every job cloned from a template.  It also gives us the flexibility to offer different functionality by maintaining multiple templates, each with a specific set of behaviors.  Finally, it reduces the time and complexity of building a new job to almost nothing.

Job templates don’t solve every problem.  For example, if a template change needs to apply retroactively, we still have to make that change to every existing job as well.  It’s best to think of templating as snapshotting rather than inheritance.  With this in mind, we can offer versioned templates to our users simply by using an appropriate naming convention.  We then send out “release” emails encouraging users to create new jobs from the latest template, along with a list of the new features it offers.
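
As a rough sketch of what this looks like in practice, a new job can be cloned from a versioned template with the Jenkins CLI; the server URL, template name, and job name below are placeholders rather than our actual values:

    # Hypothetical versioned template names: test-template-v1, test-template-v2, ...
    # Clone a new job from the latest template via the Jenkins CLI.
    java -jar jenkins-cli.jar -s https://jenkins.example.com/ \
      copy-job test-template-v2 web-listings-smoke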

Source Control

Prerequisites: Git Plugin (or equivalent SCM plugin)

For many years, we wrote the bash scripts for our test runners and other functionality directly into Jenkins jobs.  This was a nightmare to maintain, between copying and pasting code from job to job and the risk of losing code entirely if someone reconfigured or deleted the one job that housed it.  It’s simply a terrible practice.

While working with the new box, we came to the realization that all of the Bash and Docker code used within Jenkins should live in its own repository.  It’s worth mentioning that we did not put this code in the existing test repositories that ran within Jenkins; rather, we created a separate repository for this setup code.  Throughout this post, I will refer to them as the test repository and the assistance repository, respectively.
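
To sketch how this plays out in a job (the repository URL, directory layout, and script name here are hypothetical), an “Execute shell” build step shrinks down to something like this:

    set -euo pipefail

    # The job's SCM step has already checked out the test repository into the
    # workspace; fetch the assistance repository alongside it.
    git clone --depth 1 git@github.example.com:test-eng/jenkins-assistance.git assistance

    # Delegate environment setup and the test run to versioned scripts instead
    # of bash pasted into the job configuration.
    ./assistance/bin/run_tests.sh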

Dockerize Your Environment

Prerequisites: Docker must be installed on Jenkins (with access to the image repo)

If you take nothing else from this post, using Docker in your Jenkins jobs will greatly improve the speed of your test runs.  One of the biggest hang-ups in running tests on Jenkins is the environment setup that must occur before every test run (unless you share workspaces and environments between jobs, which comes with its own set of problems).  If you set up a nightly build that clones the latest copy of your repo and bundles its requirements into a Docker image, all you have to do is pull that image onto your slave, start a container, and run your tests inside it.  Of course, you’d want to do any last-minute branch pulls and bundles before running the tests, but that is a quick addendum rather than a slow, full setup.
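
Here is a minimal sketch of those per-build steps, assuming a nightly image published as registry.example.com/qa/test-base:nightly and an RSpec-based suite; the registry, paths, and BRANCH parameter are stand-ins for whatever your setup uses:

    set -euo pipefail

    # Grab the prebuilt image; the test repo and its gems are already baked in.
    docker pull registry.example.com/qa/test-base:nightly

    # Only the requested branch and any new gems need a quick refresh before
    # the tests run inside the container.  BRANCH is a (hypothetical) job
    # parameter expanded by the Jenkins shell before docker run executes.
    docker run --rm registry.example.com/qa/test-base:nightly bash -c "
      cd /opt/tests &&
      git fetch origin ${BRANCH:-master} &&
      git checkout origin/${BRANCH:-master} &&
      bundle install --quiet &&
      bundle exec rspec
    "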

We’ve actually taken this a step further with two jobs that build images.  The first builds a base image with the version of Ruby we currently use and any packages we want pre-installed; it is kicked off manually, since updates at that level are rare.  The second runs nightly and builds an up-to-date image that clones our test repository and bundles its dependencies for use in future containers.  This setup lets us update our version of Ruby with little effort while keeping project setup time to a minimum.
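
A rough sketch of those two jobs follows; the image names, registry, and Ruby version are illustrative assumptions rather than our actual values:

    # Job 1 (manual): base image with our Ruby version and common packages.
    docker build -t registry.example.com/qa/ruby-base:2.3 -f Dockerfile.base .
    docker push registry.example.com/qa/ruby-base:2.3

    # Job 2 (nightly): layer a fresh clone of the test repository and its
    # bundled gems on top of the base image (Dockerfile.nightly lives in the
    # assistance repository and roughly does: FROM the base image, git clone
    # the test repo into /opt/tests, then bundle install).
    docker build -t registry.example.com/qa/test-base:nightly -f Dockerfile.nightly .
    docker push registry.example.com/qa/test-base:nightly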

Refactor Often

Prerequisites: Patience and experience

Anyone who has developed within Jenkins in any capacity knows that it’s a fickle thing that tends to degrade into a hacked-together solution driven by experimental code.  This is usually because once you get something working in Jenkins at any level, you won’t want to touch it again for fear of re-breaking it.  Sadly, that only leads to more problems down the line.

If you keep your code out of Jenkins job configurations and in an SCM instead, refactoring your Jenkins setup becomes less terrifying and can even be an enjoyable experience.  Without this step, it is hard to pursue enhanced functionality within Jenkins; with it, much more is possible.


With this first milestone complete, we are already in a better place, but this is really just the jumping-off point for optimizing our tests to run quickly and at scale.  In the next milestone, I will discuss the details of test parallelization within our Jenkins builds.