Gatekeeper Regression Triggers

At the close of the last milestone, I discussed one of the applications for which we use parallelization today.  It takes the form of a Jenkins job that serves as a post-deployment process, running all tests for a given app within a given environment.  We call this job the Gatekeeper, with the idea that if tests fail, the gate to the next highest environment remains closed.  The concept of the Gatekeeper goes hand in hand with the concept of Continuous Delivery.  Instead of scheduling feature-heavy releases, features get shipped to production when they are ready.  This means that we could have several deployments a day, any day of the week.  The key to Gatekeeper's success is running a growing collection of tests in under a couple of minutes using the parallelization features we built in the previous milestones.  TrueCar is now one step closer to adopting Continuous Delivery!

There was at least one more major optimization that we needed to build.  We wanted to make a distinction between how Gatekeeper handles critical path tests (CPT) and regression tests.  The specific functionality we are looking for is to have Gatekeeper's gate remain closed only for CPT failures.  From our experience, the more automated tests you have, the less likely it is that they will all pass 100% of the time (especially web tests).  Lower-priority tests that fail for known reasons or in an intermittent way should not hold up our releases.  On the other hand, we didn't want to refrain from running our regression tests altogether.  This is where regression triggers come into play.

With this new functionality, critical path tests will be run at the forefront of the Gatekeeper job, ensuring that critical functionality still works during deployment.  As a post-build step, Gatekeeper will asynchronously kick off other Jenkins jobs that run the full suite for a given app.  They are asynchronous because we are not urgently worried about the result of the test run.  The test engineer just needs to follow up on the results of the regression jobs once they finish.

The script that we built uses the Jenkins API to both read jobs from a dashboard view and trigger each one it finds in sequence.  Dashboard views need to be created ahead of time and named according to a convention that allows for programmatic access.  The tester can add any existing job to this view, which effectively includes the job in the Gatekeeper workflow.
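
To make the view-reading half concrete, here is a minimal sketch of parsing the JSON that a Jenkins view's `api/json` endpoint returns.  The sample response is illustrative (the job names and URLs below are made up, not from a real server); the real script fetches this document over HTTP instead.

```ruby
require 'json'

# Illustrative sample of what GET <view-url>/api/json returns.
# Job names and URLs here are hypothetical.
sample_view = <<~JSON
  {
    "jobs": [
      {"name": "myapp_qa_smoke", "url": "https://jenkins.example.com/job/myapp_qa_smoke/"},
      {"name": "myapp_qa_full",  "url": "https://jenkins.example.com/job/myapp_qa_full/"}
    ]
  }
JSON

# Every job added to the view shows up in the "jobs" array,
# so collecting their URLs gives us the trigger list.
jobs = JSON.parse(sample_view)['jobs']
job_urls = jobs.map { |job| job['url'] }
puts job_urls
```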

Building the Script

The script uses a combination of Bash and Ruby to read jobs from a view, process each job’s parameters, build the individual payloads, and finally trigger each job through Jenkins’ API.

## assistance/jenkins/trigger_regressions.sh

# Exit out early if REGRESSION flag is not set
if [ "${REGRESSION:-false}" != true ]; then exit 0; fi

TOKEN="YOUR_SECRET_TOKEN"
env="${1}"

# Get a filtered list of all available jobs
url="<YOUR_JENKINS_URL>/view/Dashboards/${APP}_${env}/api/json"
export DASHBOARD_URL="${url}"
export GATEKEEPER_URL="${BUILD_URL}"

url_and_params=$(ruby ./jenkins/process_triggers.rb)

# Handle the case where the Dashboard URL doesn't exist
if [[ "${url_and_params}" == *"Dashboard URL not found"* ]]; then
    echo "${url_and_params}"
    url_and_params=""
fi

# Loop through each job in the dashboard and kick it off
triggered=""
for pair in ${url_and_params}; do
    # Split "<job_url>_-_<params>_-_<next_build>" into its fields
    parsed=(${pair//_-_/ })

    job_url=${parsed[0]}
    build_url="${job_url}build?token=${TOKEN}"

    # Restore the spaces that the Ruby script encoded as -_-
    params=${parsed[1]//-_-/ }

    echo "JOB URL: ${job_url}"
    echo "Parameters: ${params}"

    result=$(curl -X POST "${build_url}" --data-urlencode json="${params}")
    sleep 1

    if [[ "${result}" == *"Authentication required"* ]]; then
        temp="Token does not match for job: <a href='${job_url}'>${job_url}</a>."
        temp+=" <strong>Job was not triggered!</strong>"

        # Encode spaces so the message survives later word splitting
        temp=${temp// /^^}

        triggered+=" ${temp}"
    else
        triggered+=" ${job_url}${parsed[2]}"
    fi
done

echo "TRIGGERED=${triggered}" >> "${BUILD_TAG}/env.props"

## assistance/jenkins/process_triggers.rb
require 'json'
require 'open-uri'

begin
  jobs = JSON.parse(URI.open("#{ENV['DASHBOARD_URL']}?depth=1") { |f| f.read })['jobs']
rescue OpenURI::HTTPError
  puts 'Dashboard URL not found. No jobs will be triggered!'
  jobs = []
end

url_and_params = jobs.map do |job|
  job_url = job['url']

  param_defs = job['property'].reject(&:empty?).first['parameterDefinitions']
  job_params = param_defs.map do |param|
    pair = param['defaultParameterValue']
    value = pair['value']

    # Our Gatekeeper always runs tests off of the "ci" branch
    value = 'ci' if pair['name'] == 'TEST_REPO_BR'

    # Supply the Gatekeeper Build URL if this parameter
    # is found (for reporting)
    value = ENV['GATEKEEPER_URL'] if pair['name'] == 'TRIGGERER'

    { name: pair['name'], value: value }
  end

  supposed_next_build = job['nextBuildNumber']

  # Encode for Bash and return
  params_json = { parameter: job_params }.to_json.gsub(' ', '-_-')
  "#{job_url}_-_#{params_json}_-_#{supposed_next_build}"
end

# Print the encoded records (one per line) for the Bash script to capture
puts url_and_params

Note the secret token reference at the top of the Bash script.  This token must be set within the configuration of every job that you might want to trigger.  If you are using job templates, be sure to update the token there as well.

As you can see from the Ruby script, all parameters must be accounted for when triggering jobs via the API.  Although this adds overhead to our script, it also gives us the opportunity to override any of the default values.  If the job has parameters that govern the environment, app, or Git branch, you will want to make sure their values are consistent with the Gatekeeper build that triggered the job.  In our example, we update the value of the test repository's branch parameter to "ci" since our Gatekeeper always runs off of this branch.  We also set a parameter called TRIGGERER, if it exists, to the parent job's URL.  This parameter will allow us to point back to the parent job's build from the triggered job.
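
Concretely, the payload Jenkins accepts here is a JSON object with a `parameter` array of name/value pairs.  A minimal sketch of assembling one, with the overrides applied (the parameter names and the Gatekeeper URL below are illustrative, not real):

```ruby
require 'json'

# Default values as they might come back from a job's
# parameterDefinitions; names and values here are hypothetical.
defaults = [
  { 'name' => 'TEST_REPO_BR', 'value' => 'master' },
  { 'name' => 'TRIGGERER',    'value' => '' }
]

job_params = defaults.map do |pair|
  value = pair['value']
  # Force the branch to "ci" and point TRIGGERER back at the parent build
  value = 'ci' if pair['name'] == 'TEST_REPO_BR'
  value = 'https://jenkins.example.com/job/gatekeeper/42/' if pair['name'] == 'TRIGGERER'
  { name: pair['name'], value: value }
end

payload = { parameter: job_params }.to_json
puts payload
```

The resulting string is what ends up in the `--data-urlencode json=` argument of the curl call.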

Finally, the Bash script ends by writing a variable called TRIGGERED to an env.props file; it holds links to all of the jobs that were triggered.  Both TRIGGERED and TRIGGERER will be used within our aggregated reports script (see below).
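
Because the Bash side word-splits on whitespace, the two scripts smuggle spaces through with placeholder delimiters: `-_-` stands in for spaces inside a payload, `^^` inside a report message, and `_-_` joins the URL, payload, and build number into one record.  A minimal round-trip sketch of that encoding (the URL and payload below are illustrative):

```ruby
# Encode: replace spaces so the payload survives Bash word splitting,
# then join the fields with the _-_ record separator.
job_url = 'https://jenkins.example.com/job/myapp_qa_full/'
params_json = '{"parameter": [{"name": "ENV", "value": "qa"}]}'.gsub(' ', '-_-')
record = "#{job_url}_-_#{params_json}_-_17"

# Decode: split the record back apart and restore the spaces, mirroring
# parsed=(${pair//_-_/ }) and ${parsed[1]//-_-/ } on the Bash side.
url, payload, build_number = record.split('_-_')
payload = payload.gsub('-_-', ' ')
puts url
puts payload
puts build_number
```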

Updating the aggregated reports

With the addition of TRIGGERED and TRIGGERER, we simply needed to add a check for them and create a new div in our report if one or the other shows up.  Since the triggered jobs are asynchronous, we just want to provide the associated links without any mention of results.

## assistance/jenkins/aggregated_report.sh (snippet)
regression_triggers() {
  if [ -n "${TRIGGERED}" ]; then
    echo "<div class='panel panel-default'>"
    echo "    <div class='panel-heading'>"
    echo "        <strong>"
    echo "            Asynchronous Regression Jobs Triggered"
    echo "        </strong>"
    echo "    </div>"
    echo "    <div class='panel-body'>"
    echo "        <ul>"
    for job_url in ${TRIGGERED}; do
      # Restore the spaces that were encoded as ^^
      job_url=${job_url//^^/ }

      if [[ "${job_url}" == *"Token does not match"* ]]; then
        echo "            <li>${job_url}</li>"
      else
        echo "            <li>"
        echo "                <a href='${job_url}'>${job_url}</a>"
        echo "            </li>"
      fi
    done
    echo "        </ul>"
    echo "    </div>"
    echo "</div>"
  elif [ -n "${TRIGGERER}" ]; then
    echo "<div class='panel panel-default'>"
    echo "    <div class='panel-heading'>"
    echo "        <strong>"
    echo "            This Job was Triggered by Gatekeeper"
    echo "        </strong>"
    echo "    </div>"
    echo "    <div class='panel-body'>"
    echo "        <a href='${TRIGGERER}'>${TRIGGERER}</a>"
    echo "    </div>"
    echo "</div>"
  fi
}

Integration

Adding the new script to our existing setup was straightforward.  We added it immediately after the run of the aggregated_report_init.sh script, as a post-build step within our Gatekeeper job.

With that, we now have an efficient testing solution that allows for maximum coverage while letting Gatekeeper hold back deployments for only the most critical test cases.

There are other features that we could add to this setup.  For example, it would be great to build an aggregated report that is more than a collection of links to other reports.  Nevertheless, our test strategy stands in much better shape today.  With test parallelization now the standard across all of our Jenkins jobs, test engineers at TrueCar are able to respond quickly to failures within our system.  I hope that this blog series can serve you in a similar way.