Speed up CI through impact analysis
Definition
What does impact analysis actually mean in terms of CI?
- in brief:
It is a time-saving technique: we determine what a change impacted and skip unrelated actions (e.g.: building or testing).
- in practice:
We change production code, analyze what that change affected, and then decide what should be built and which tests should be executed based on that analysis.
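As a rough illustration of that decision, the sketch below maps changed paths to the actions worth running. The rule table, folder names, and action names are hypothetical and exist only for this example; a real pipeline would derive them from its own layout.

```ruby
# Hypothetical rule table: path prefix => actions to run when it changes.
RULES = {
  'App/'  => %i[build unit_tests ui_tests],
  'Core/' => %i[build unit_tests],
  'Docs/' => []
}.freeze

# Returns the sorted, de-duplicated list of actions for the changed files.
# Files that match no rule fall back to every known action, to stay safe.
def actions_for(changed_files, rules = RULES)
  all_actions = rules.values.flatten.uniq
  changed_files.flat_map do |path|
    prefix = rules.keys.find { |key| path.start_with?(key) }
    prefix ? rules[prefix] : all_actions
  end.uniq.sort
end

p actions_for(['Docs/README.md'])
p actions_for(['Core/Parser.swift', 'Docs/README.md'])
```

A docs-only change yields an empty list, so the pipeline could skip building and testing altogether.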
Surface impact analysis based on filters and triggers
Here we use keywords in the TRIGGER and IGNORE environment variables and run tests that have been pre-filtered via tags/classes. This can be pretty helpful in a monorepo world.
- Create a script (check_changes.rb) that determines which files were impacted:

    require 'json'

    # Comma-separated keywords. A file counts as impacted only if its path
    # contains a TRIGGER keyword and no IGNORE keyword.
    required_keys = ENV.fetch('TRIGGER').split(',').map(&:strip)
    ignored_keys = ENV.fetch('IGNORE', '').split(',').map(&:strip)

    # List the files changed in the pull request via the GitHub API.
    # Note: this endpoint is paginated, so very large pull requests need extra pages.
    response = JSON.parse(`curl -s -H "authorization: Bearer #{ENV['GITHUB_TOKEN']}" -X GET -G #{ENV['PULL_REQUEST']}/files`)
    impacted_files = response.map { |file| file['filename'] }

    impacted_files.select! do |path|
      ignored_keys.none? { |key| path.include?(key) } &&
        required_keys.any? { |key| path.include?(key) }
    end

    # The pipeline reads this number: 0 means nothing relevant was touched.
    puts impacted_files.size
- Create a CI pipeline that passes the required info to the script above and acts on its output:

    name: Sample
    on: [pull_request]
    jobs:
      first_project:
        name: FirstProjectTests
        runs-on: [macos-latest]
        steps:
          - uses: actions/checkout@v2
          - name: Runner
            env:
              PULL_REQUEST: ${{ github.event.pull_request._links.self.href }}
              GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
              TRIGGER: "FirstProjectRelatedFolder/, VeryImportantFolder/, Fastfile"
              IGNORE: "FirstProjectRelatedFolder/tests/, .md"
            run: |
              if [ "$(ruby ./.github/scripts/check_changes.rb)" = 0 ]; then
                echo "Required files were not touched"
              else
                fastlane first_project_tests
              fi
      second_project:
        name: SecondProjectTests
        runs-on: [macos-latest]
        steps:
          - uses: actions/checkout@v2
          - name: Runner
            env:
              PULL_REQUEST: ${{ github.event.pull_request._links.self.href }}
              GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
              TRIGGER: "SecondProjectRelatedFolder/, VeryImportantFolder/, Fastfile"
              IGNORE: "SecondProjectRelatedFolder/tests/, .md"
            run: |
              if [ "$(ruby ./.github/scripts/check_changes.rb)" = 0 ]; then
                echo "Required files were not touched"
              else
                fastlane second_project_tests
              fi
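The filtering rule from check_changes.rb can also be tried locally, without the GitHub API. The sketch below is only an illustration: the helper names (`parse_keys`, `impacted_files`) and the list of changed files are made up for this example, and the keyword strings mirror the ones used for the first project.

```ruby
# Splits a comma-separated keyword list ("a/, b/") into trimmed, non-empty keys.
def parse_keys(raw)
  raw.to_s.split(',').map(&:strip).reject(&:empty?)
end

# Keeps the paths that contain at least one trigger key and no ignore key,
# the same rule check_changes.rb applies to the pull request files.
def impacted_files(paths, trigger:, ignore: nil)
  required_keys = parse_keys(trigger)
  ignored_keys = parse_keys(ignore)
  paths.select do |path|
    ignored_keys.none? { |key| path.include?(key) } &&
      required_keys.any? { |key| path.include?(key) }
  end
end

# Made-up list of changed files, standing in for the GitHub API response.
changed = [
  'FirstProjectRelatedFolder/Sources/Login.swift',
  'FirstProjectRelatedFolder/tests/LoginTests.swift',
  'VeryImportantFolder/README.md',
  'SecondProjectRelatedFolder/Sources/Cart.swift'
]

result = impacted_files(
  changed,
  trigger: 'FirstProjectRelatedFolder/, VeryImportantFolder/, Fastfile',
  ignore: 'FirstProjectRelatedFolder/tests/, .md'
)
puts result
puts result.size # the number the pipeline compares against 0
```

Keeping the rule in a plain function like this makes it easy to check a new TRIGGER or IGNORE value before committing it to the workflow file.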
Deep impact analysis based on code coverage tools
Here we use code coverage tools to generate an impact map and run only the tests that match the impacted files.
Nothing has changed since Paul Hammant described it so breathtakingly, so I’ll just leave a link here:
The Rise of Test Impact Analysis
And here is a quick guide to action:
- Configure a code coverage tool (e.g.: iOS example)
- Run each test one by one:
  - Collect a code-coverage report for each test
  - Extract the involved source files from the code-coverage report
- Create a map (impact_map.json), where test names are the keys and arrays of source files are the values, for example:

    {
      "SampleSuite/SampleClass/testMethod_1": [
        "path/to/a.swift",
        "path/to/b.swift",
        "path/to/c.swift"
      ],
      "SampleSuite/SampleClass/testMethod_2": [
        "path/to/c.swift",
        "path/to/d.swift",
        "path/to/e.swift"
      ]
    }
- Create a script (check_changes.rb) that determines which tests to run:

    require 'json'

    # Test name => source files that the test covers.
    impact_map = JSON.parse(File.read('impact_map.json'))

    # List the files changed in the pull request via the GitHub API.
    response = JSON.parse(`curl -s -H "authorization: Bearer #{ENV['GITHUB_TOKEN']}" -X GET -G #{ENV['PULL_REQUEST']}/files`)
    impacted_files = response.map { |file| file['filename'] }

    # Keep only the tests that cover at least one changed file.
    impact_map.select! do |_test, related_source_files|
      related_source_files.any? { |file| impacted_files.include?(file) }
    end

    # Comma-separated test names, ready to be passed to the test runner.
    puts impact_map.keys.join(',')
- Create a CI pipeline that passes the required info to the script above and runs the tests it selects:

    name: Sample
    on: [pull_request]
    jobs:
      sample:
        name: Tests
        runs-on: [macos-latest]
        steps:
          - uses: actions/checkout@v2
          - name: Runner
            env:
              PULL_REQUEST: ${{ github.event.pull_request._links.self.href }}
              GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
            run: |
              essential_tests=$(ruby ./.github/scripts/check_changes.rb)
              if [ -z "$essential_tests" ]; then
                echo "No impacted tests were found"
              else
                fastlane test only_testing:"$essential_tests"
              fi
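The "run each test and build the map" steps above can be sketched roughly as follows. This assumes the per-test coverage has already been exported into a simplified structure; the `coverage_by_test` hash, its `path` and `covered_lines` fields, and the `build_impact_map` helper are hypothetical. A real coverage tool emits its own report format, which would need its own parsing step.

```ruby
require 'json'

# Hypothetical, simplified per-test coverage data (not a real tool's format).
coverage_by_test = {
  'SampleSuite/SampleClass/testMethod_1' => [
    { 'path' => 'path/to/a.swift', 'covered_lines' => 12 },
    { 'path' => 'path/to/b.swift', 'covered_lines' => 0 },
    { 'path' => 'path/to/c.swift', 'covered_lines' => 3 }
  ],
  'SampleSuite/SampleClass/testMethod_2' => [
    { 'path' => 'path/to/c.swift', 'covered_lines' => 5 },
    { 'path' => 'path/to/d.swift', 'covered_lines' => 1 }
  ]
}

# Builds the impact map: test name => sorted list of source files in which
# the test executed at least one line.
def build_impact_map(coverage_by_test)
  coverage_by_test.transform_values do |files|
    files.select { |file| file['covered_lines'].positive? }
         .map { |file| file['path'] }
         .uniq
         .sort
  end
end

impact_map = build_impact_map(coverage_by_test)
puts JSON.pretty_generate(impact_map)
```

Writing the result with `File.write('impact_map.json', JSON.pretty_generate(impact_map))` would produce the file that check_changes.rb reads. Files with zero covered lines are dropped so that a test is not re-run for code it never touches.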
Conclusion
This was just a superficial analysis, a rough idea of how we can influence the speed of the CI process. Keep it in mind and build your own pipeline as fast as you wish!
See ya (: