Test Parallelization in Jenkins

In this milestone, I'm going to walk through the details of setting up our Jenkins builds to run all tests in parallel.  The solution is custom-built, even though existing tools achieve similar results.  One of the problems I've faced with some of those parallelization tools is thread collisions between tests, which show up as scrambled logs and mangled reports.  Logging and reporting are critical aspects of testing here at TrueCar, so I had to implement a solution that kept both intact.

Our parallelization efforts began with a manual bucketing strategy driven by a strict tagging system that we adopted.  Tests were tagged based on the product they were a part of, along with the actual feature they focused on.  We defined a bucket as a collection of tests matching a specified set of tags.  Within a single bucket, each test ran sequentially; each bucket referenced a different collection of tests and ran in parallel with the other buckets.

This was a step up from our initial sequential execution strategy, but it definitely had its limits.  It all depended on how the test maintainers tagged their tests: if tags were improperly specified or utilized, some tests could end up running as part of multiple buckets.  Another issue was that a particularly long test would slow down its bucket and unbalance its completion time relative to the other buckets.  Ultimately, a build was only as fast as its slowest bucket, which dragged out overall execution time.

In time, we needed to explore another avenue: one that removed the human element and automatically parallelized every individual test case into its own bucket.  With this strategy, a build would take only as long as its slowest test case.  Working with the manual bucketing strategy within Jenkins showed us that it was possible; it just required a bit more design and development.

Prerequisites

  • Jenkins Plugins (Matrix Project Plugin, Dynamic Axis, Environment Injector Plugin, Conditional BuildStep)
  • Testing Framework (Dry-run capabilities, JSON reporting to file, execution by test class/name)

Before any Jenkins work is done, you need to make sure that the testing framework you are working with has the above-stated capabilities.  If it doesn't (as in the case of minitest), you will need to build this functionality in.  If you happen to be using Cucumber, then you are good to go.

A little note on Minitest setup:

Minitest is very easy to add custom functionality to, since it's very lightweight.  For individual test case execution support, I patched minitest to read an environment variable named TEST_NAME, which takes the full test method name with an optional test class name (to avoid collisions).  The patch checks TEST_NAME and only adds a test case to the execution queue if its name matches.  Not only does this help us with our goal, but it's also a great feature for regular use!
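
The exact patch isn't reproduced in this post, but a minimal sketch of the idea, assuming a "TestClassName#test_method_name" format for TEST_NAME and a filter hung off Minitest's runnable_methods hook (your own patch may use a different hook), could look like this:

## test/support/test_name_filter.rb (hypothetical sketch)
require 'minitest'

module TestNameFilter
  def runnable_methods
    methods = super
    target  = ENV['TEST_NAME'].to_s
    return methods if target.empty?

    # TEST_NAME may be "test_method_name" or "TestClassName#test_method_name"
    klass, method = target.include?('#') ? target.split('#', 2) : [nil, target]
    return [] if klass && klass != name

    methods.select { |m| m == method }
  end
end

# Prepend onto the class-level lookup so only the matching test case is queued
Minitest::Test.singleton_class.prepend(TestNameFilter)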

For the JSON reporting component, I ended up adding the 'minitest-reporters-json_reporter' gem to my bundle and patching it to generate an actual JSON file (which it doesn't do out-of-the-box).  While I was in there, I also added the class/method name value that I would later use with the TEST_NAME functionality I built.  This makes the processing I need to do in the Jenkins job much easier later on.
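
I won't reproduce the gem patch here either, but to give a sense of the moving parts, a bare-bones home-grown reporter that writes a JSON file and tucks the class/method name into each test's metadata could look roughly like this (the examples/metadata/test_name key names are my own, not the gem's):

## test/support/json_file_reporter.rb (hypothetical home-grown alternative)
require 'json'
require 'minitest'

class JsonFileReporter < Minitest::AbstractReporter
  def initialize(dir)
    super()
    @dir = dir
    @examples = []
  end

  # Called once for every finished test
  def record(result)
    klass = result.respond_to?(:klass) ? result.klass : result.class.name
    @examples << {
      'name'     => result.name,
      'time'     => result.time,
      'passed'   => result.passed?,
      # The "Class#method" string consumed later as TEST_NAME
      'metadata' => { 'test_name' => "#{klass}##{result.name}" }
    }
  end

  # Called once at the end of the run
  def report
    File.write(File.join(@dir, 'report.json'),
               JSON.pretty_generate('examples' => @examples))
  end
end

# Register it from a Minitest plugin init hook, where Minitest.reporter
# is available, e.g.:
#
#   def Minitest.plugin_json_file_init(options)
#     reporter << JsonFileReporter.new(ENV.fetch('REPORT_PATH', '.'))
#   end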

Finally, I needed to add the dry-run component.  One last monkey-patch of the minitest module has it watch for an environment variable called DRY.  When DRY is set, the define_method call that minitest uses behind the scenes to register test cases receives an empty block instead of the block that holds the test code, so every test case is still registered and reported, but none of the test code actually runs.
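
Since that define_method call lives behind minitest/spec's it blocks, a hedged sketch of the patch, assuming spec-style tests (plain def test_... methods would need a different hook), could look like this:

## test/support/dry_run_patch.rb (hypothetical sketch, assuming spec-style tests)
require 'minitest/spec'

module DryRunPatch
  # `it` is what ultimately calls define_method to register a test case.
  # Passing an empty block keeps the test registered (and reported)
  # while skipping the test code entirely.
  def it(desc = 'anonymous', &block)
    return super unless %w[1 true].include?(ENV['DRY'].to_s.downcase)
    super(desc) { }
  end
end

# Spec classes pick up `it` through Minitest::Spec's singleton ancestry
Minitest::Spec.singleton_class.prepend(DryRunPatch)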

Building the Init Script

My first step was to build the bash script that would perform the dry-run, process the JSON report, and define the individual buckets.  With this setup, we remain heavily dependent on a smart tagging strategy.  Even though manual buckets are a thing of the past, we still want to control which tests should and shouldn’t get run as part of a build.  The init script should perform the dry run, using any tags that may have been passed into the build.

## assistance/jenkins/parallel_init.sh
source ./jenkins/common.sh

# Path that the EnvInject "Properties File Path" setting points at
PROPERTIES_FILEPATH="$(pwd)/${BUILD_TAG}/env.props"

# This block exits the script early when the current job is an axis
# (child) job rather than the parent matrix job
if [[ "${JOB_NAME}" == *"TNAME"* ]]; then
   # Since the properties file is expected to exist (even
   # for axis jobs), create an empty one
   rm -f "${PROPERTIES_FILEPATH}"
   touch "${PROPERTIES_FILEPATH}"
   exit 0
fi

docker run \
   -v "${JENKINS_MOUNT_DIR}":"${DOCKER_MOUNT_DIR}" \
   -e init='true' \
   -e test_repo_br="${TEST_REPO_BR:-master}" \
   -e docker_mount_dir="${DOCKER_MOUNT_DIR}" \
   -e tags="${TAGS}" \
   --rm \
   your.docker_registry.com/your-image \
   /bin/bash -c -l "
 source ${DOCKER_MOUNT_DIR}/assistance/jenkins/common.sh

 setup_and_run_minitest
"

# Take the names of the test cases that ran and add them
# to the properties file

process_json_file

As you can see, Docker is used to execute the dry run.  This lets us utilize the pre-packaged environment, which saves some setup time.  Even so, an init script adds extra execution time, so we want to make it as quick as possible.  One caveat to using Docker is that you need to mount your directories wisely.  This init script runs before a copy of our SCM is cloned into the workspace (which means a temporary clone of the assistance repo must take place), so our script points the mounted directory to the same location as the temporary SCM clone.  The other thing this script does is explicitly pass every needed environment variable into the Docker container; all variable exporting must take place within the Docker container.

## assistance/jenkins/common.sh
function setup_and_run_minitest() {
 git checkout ${test_repo_br}
 bundle

 # This condition marks the only difference between dry
# runs and real runs

 if [ "${init}" = "true" ]; then
     export TAGS=${tags}
     export DRY=1
 else
     export TEST_NAME="${test_name}"
 fi

 export REPORT_PATH=${docker_mount_dir}
 rake specs
}

function process_json_file() {
 echo "TNAME=$(ruby -e "
   require 'json'
   # The dry run drops report.json into the mounted build directory
   data = JSON.parse(
     File.read('${JENKINS_MOUNT_DIR}/report.json'))
   # puts <Your space-delimited test cases that will be
   # recognizable by TEST_NAME>
 ")" > "${PROPERTIES_FILEPATH}"
}

Finally, at the bottom, you'll notice a function that processes the JSON report.  It reads the JSON file and walks through every test case that was run as part of the dry run.  Each test's TEST_NAME is pulled from its metadata (which was set up during my patch of the json-reporter gem), and the names are joined into a space-delimited string that is stored in a properties file to be used for bucket creation (more on the properties file later).
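
As a purely hypothetical illustration of that placeholder, if your patched reporter nests the Class#method string under examples → metadata → test_name (as in the earlier reporter sketch), the Ruby dropped into that command substitution could be as simple as the following (note that ${JENKINS_MOUNT_DIR} is expanded by Bash before Ruby ever sees it):

# Hypothetical fill-in for the placeholder above; adjust the keys to
# match whatever structure your patched reporter actually writes
require 'json'
data = JSON.parse(File.read('${JENKINS_MOUNT_DIR}/report.json'))
puts data['examples'].map { |ex| ex['metadata']['test_name'] }.join(' ')
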
It's worth mentioning that all code in this post has been generalized for the sake of addressing the overall concept.  That being said, some details have been omitted that you will likely need to consider when attempting to reproduce this work.  For example, when working with Bash, you need to encode values that contain spaces so that you can safely use spaces as delimiters when looping through values.  Despite the missing details, you should be able to fill in the gaps as you grapple with Bash.

Test Runner Script

For this script, you can actually use a lot of the same code referenced above.  Simply omit passing in the “init” and “tags” environment variables and pass in the TEST_NAME that was obtained during the JSON file processing in the init phase.

## assistance/jenkins/parallel_execute.sh
source ./jenkins/common.sh

docker run \
    -v "${JENKINS_MOUNT_DIR}":"${DOCKER_MOUNT_DIR}" \
    -e test_repo_br="${TEST_REPO_BR:-master}" \
    -e docker_mount_dir="${DOCKER_MOUNT_DIR}" \
    -e test_name="${TEST_NAME}" \
    --rm \
    your.docker_registry.com/your-image \
    /bin/bash -c -l "
  source ${DOCKER_MOUNT_DIR}/assistance/jenkins/common.sh

  setup_and_run_minitest
"

Building the Jenkins Job

The following are steps that you can follow to create your job for parallelization…

1. Define all parameters needed.  This would include a TAGS and a TEST_REPO_BR parameter.

2. Click on the “Prepare an environment for the run” checkbox.

The Properties File Path should point to a location that is unique to the build.  If you store this properties file in a common location shared by all jobs, you will run into collisions whenever more than one job that uses it runs at the same time.  I used the BUILD_TAG as a directory location to store the properties file as well as the temporary clone of the Jenkins assistance repo.  You may have noticed that the BUILD_TAG directory is also the one that we mount into the Docker container in the init script.  All of these things are essential!

Within the Script Content section, the init behavior is defined.  As part of that, the assistance repo must be cloned and the init script (defined above) must be run.

3. Source Code Management should clone the Jenkins assistance repo (not the test repo).  The reason behind this is that all test repo interactions are done within Docker, and the Docker image already has the environment and code set up.

4. Set up the Configuration Matrix.  This is where you add a Dynamic Axis named TNAME, with its values coming from the TNAME variable that the process_json_file function wrote to the properties file during the init phase.  After the init script runs, Jenkins' matrix plugin will create one configuration per test name listed in TNAME (see the example after this list).

5. Execute your test runner as part of the Build section.  You should only have to add two lines here, which call the Bash script that was built in the previous section (see the sketch after this list).

6. Apply any other configurations (e.g. notifications, artifact archiving, etc.).

7. Save the configuration and give it a run.
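
For reference, once the parent job's init script has run, the properties file that the Dynamic Axis reads ends up being a single line along these lines (the test names here are made up):

## $(pwd)/${BUILD_TAG}/env.props (hypothetical contents)
TNAME=LoginTest#test_valid_credentials SearchTest#test_basic_filters CheckoutTest#test_guest_flow

The Dynamic Axis plugin splits the space-separated values of TNAME into one matrix cell per test name, and each cell sees its own value in the TNAME environment variable.  The two lines mentioned in step 5 aren't spelled out in this post, but they presumably boil down to something like the following (hypothetical; exporting TEST_NAME from the axis value is my own bridging assumption, and the script path depends on where SCM checks out the assistance repo):

# Hypothetical Build step for each matrix cell
export TEST_NAME="${TNAME}"
bash ./jenkins/parallel_execute.sh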

Depending on how many tests you generally run in parallel, you may need to increase the number of slaves, along with the executors per slave, in your Jenkins cluster.  Each axis value that is created will run in a separate executor, which can flood the build queue.  But if your individual test cases are quick, flooding the build queue isn't a problem: an executor is freed the moment it completes a test and immediately moves on to the next one.

Further Discussion

Cucumber

The emphasis of this blog is on working with Minitest, but our initial solution was with Cucumber.  Setting up test parallelization to use Cucumber is very similar to what you’ve seen here.  It’s also easier, since Cucumber already supports JSON reporting, dry-runs, and execution of individual test cases (using the -l option).

Defining Another Axis

If you want to run every test case in parallel, but also against different configurations (e.g. running web tests against both Chrome and Firefox), you can specify additional axes in the Configuration Matrix section of your job.  The cross product of every axis will be run, each combination within its own executor, so your test run will take just as long as before, assuming you have enough executors to handle them all.

With the completion of this milestone, your tests will be as scalable as your Jenkins system allows.  With test builds taking only as long as the slowest test case, and enough executors to run every test at once, builds that run a hundred tests can theoretically take as long as builds that run one test (given enough slaves and memory).  Of course, a balance must be struck between the number of slaves and executors you create and how many tests you want to run in parallel.  At TrueCar, we have leaned in favor of having more slaves and more executors per slave.  AWS instances are cheap to spin up, and we don't need anything bigger than a t2.large instance to run 10 executors per slave.  The cost of slow tests far outweighs the cost of keeping a large fleet of cheap slaves spun up.

Our journey isn't over yet, though.  With hundreds of tests each running on its own axis, we end up with hundreds of individual reports.  What happens if three out of a hundred break?  Should I have to hunt for each failing report in a list of a hundred test runs?  The next milestone will focus on post-processing each completed test's report and aggregating them into a single report that lets you quickly drill down into failures.