Improving the contribution experience with GitHub Actions

:warning: Warning :warning:: wall of text ahead, sorry!

As part of my “why should you vote for me” statement in the technical board elections I wrote that I wanted to improve the Django contribution experience, and I specifically called out Jenkins as something I wanted to “improve”. My problems with our current CI setup are:

  1. The CI configuration is invisible and versioned outside of Django
  2. Jenkins build logs are deleted after 24 hours (so annoying!)
  3. Our build matrix is very small for the number of python/database/os versions we support
  4. Previewing documentation changes is not easy

A lot of technologies have appeared and matured since our Jenkins setup was born that are significantly easier to use, better integrated with developers’ workflows, and offer a generally better developer experience. Below I’m going to outline a proof of concept I’ve worked on to unify the local development experience with our CI definitions and bring them all into Django itself, with the aim of running exactly the same test setup locally and on the CI. I believe this would greatly improve the general contribution experience, making everyone happier!

tl;dr

I’m proposing we bring django-docker-box into core and make it the “blessed” way to spin up an environment for running the Django test suite. Following on from that, we can use GitHub Actions with docker-box to greatly expand our database test coverage and unify local tests with the CI tests. We can then also integrate with a free tool like Netlify to build deploy previews of all documentation changes that can be viewed in the browser.

You can view a proof of concept of all this here, with the complete set of changes discussed below:

Along with a documentation preview for that PR:

The specifics

Integrating docker-box

Running the full Django test suite has never been super simple. It’s fairly easy to run the SQLite tests, and we bundle a simple settings file for that, but the moment you want to test anything more complex (DB-specific features, cache backends, multi-db, etc.) you’re on your own. I initially created django-docker-box as a proof of concept for moving off Jenkins and onto Travis CI, but I found that simply being able to run the test suite on any supported Python/database version without needing to install anything was insanely useful, and so it became its own thing.

Django wasn’t quite ready to include this in core, so it currently lives in a separate repository. This is a shame because, at the end of the day, it’s just a docker-compose.yml file, and running it from a separate project directory increases friction and makes the implementation more complex. Most tooling that integrates with docker-compose (like IDEs) also assumes docker-compose.yml lives in the project root rather than driving a project that lives somewhere else.

So, I’d like to move this into core. I think it has proved its usefulness and I believe it offers a really nice initial contributor experience. It would offer a much better experience during sprints (the images are surprisingly small) and it would also be simpler to maintain: changes made to Django have to be synced across to django-docker-box (e.g. https://github.com/django/django-docker-box/pull/22), and because it lives in a separate tree it’s not simple to use against non-master versions of Django.

Integrating docker-box with GitHub Actions

If we have docker-box inside Django itself, it unlocks a pretty powerful way of running tests: exactly the same tests can run locally and in the CI. No more weird Jenkins-specific failures, and the CI definitions can be changed as part of a pull request! Conceptually it’s as simple as running docker-compose run sqlite/postgres/etc as part of a CI job.
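
As a rough sketch (simplified; the real PoC below wraps this in a reusable action and a build matrix), a single database job could look like this:

jobs:
  postgres:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # the same command a contributor would run locally
      - run: docker-compose run postgres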

The full-matrix build that used to run as part of the django-docker-box repository has caught a few issues that we’ve missed in our Jenkins tests, often around MariaDB if I recall correctly. An advantage of versioning the CI files in the Django repository is that we can easily adjust the “quick” build matrix (discussed below) if more problems arise with specific databases.

For example, we seem to have a test failure on Postgres 11 with PostGIS 2.5: https://github.com/orf/django/pull/3/checks?check_run_id=1641111944#step:3:110.

Why GitHub Actions and not Jenkins?

We could just run docker-box commands as part of the standard Jenkins pipeline we currently have, but I believe that GitHub Actions is a superior product that offers a much richer ecosystem and better integration with GitHub itself.

One of the best features is third-party actions. In the PoC I’m using a third-party action that integrates flake8 and isort errors directly into the PR diff (https://github.com/orf/django/pull/6/files):

We can expand this to automate some nice things: we can welcome new contributors (users with no prior pull requests) and link them to the contribution guide, we can try to ensure that each PR has an associated Trac ticket in the title and description, and so on.
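
For example, welcoming first-time contributors could use the stock actions/first-interaction action; a sketch (the message wording is just illustrative and this isn’t part of the PoC):

name: Welcome
on: pull_request_target
jobs:
  greeting:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/first-interaction@v1
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          # shown only on a user's first pull request
          pr-message: >-
            Thanks for your first pull request! Please check the contribution
            guide and make sure the PR references a Trac ticket.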

The actual integration

The integration is quite simple. We define a GitHub Actions workflow that looks like this:

jobs:
  sqlite:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python: [ 3.7, 3.8, 3.9 ]
    name: sqlite-${{ matrix.python }}
    steps:
      - uses: actions/checkout@v2
      - name: Run tests
        uses: ./.github/test-action/
        with:
          app: ${{github.job}}
          python: ${{matrix.python}}
          docker_token: ${{ secrets.DOCKER_TOKEN }}

The test-action basically runs this:

runs:
  using: "composite"
  steps:
    - run: |
        docker-compose pull -q --include-deps ${{ inputs.app }} || true
        docker-compose build --pull ${{ inputs.app }}
        docker-compose run --user=root ${{ inputs.app }} ${{ inputs.test_name }}
        [[ ! -z "${{ inputs.DOCKER_TOKEN }}" ]] && (echo "${{ inputs.DOCKER_TOKEN }}" | docker login -u "tomforbes" --password-stdin && docker-compose push ${{ inputs.app }})
      shell: bash

Specifically the steps are:

  1. We pull the latest docker image and all associated dependencies (databases etc)
  2. We run docker-compose build to pick up any changes. If the cache is fresh and no new Python versions have been released, this should be a no-op.
  3. We then run the test suite (in the case above, sqlite).
  4. We then push the built docker image to the registry, to be used by future builds. This should only happen on master builds for obvious reasons.
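
One way to enforce the master-only restriction in step 4 (a sketch, not what the PoC currently does): only pass the token to the action on pushes to master, so PR builds hand it an empty value and the token check inside the composite action skips the push.

      - name: Run tests
        uses: ./.github/test-action/
        with:
          app: ${{ github.job }}
          python: ${{ matrix.python }}
          # hypothetical: only provide the token on master, so PR builds never push images
          docker_token: ${{ github.ref == 'refs/heads/master' && secrets.DOCKER_TOKEN || '' }}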

macOS and Windows

The Windows and macOS runners don’t support the Linux Docker images docker-box uses, so those builds just run a series of steps like so:

  non_docker_test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ windows-latest, macos-latest ]
        python: [ 3.8 ]
    name: sqlite-${{ matrix.python }}-${{ matrix.os }}
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python }}
      - run: brew install libmemcached
        if: ${{ matrix.os == 'macos-latest' }}
      - run: pip install -q wheel pip
      - run: pip install -q -r tests/requirements/py3.txt -e .
      - run: python tests/runtests.py -v1

Slow vs quick builds

The full build matrix includes over 60 jobs, which might be excessive. I think we should run ~20 or so jobs on every change, but include the ability to tag a PR as full-build. You could do this by commenting @buildbot full-build, which would automatically add the tag to the PR, or just via the interface. Any push to a PR that has the full-build tag would run the full build matrix of 60 jobs. We could even make it more specific by having commands like @buildbot full-postgres.

This would all be automated via a GitHub Actions workflow, which lets you respond to events like comments and labels on pull requests.
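
A rough sketch of that automation using the stock actions/github-script action (the command and label names are just the ones proposed above):

name: Buildbot commands
on: issue_comment
jobs:
  add-full-build-label:
    # only react to "@buildbot full-build" comments on pull requests
    if: github.event.issue.pull_request && contains(github.event.comment.body, '@buildbot full-build')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/github-script@v3
        with:
          script: |
            await github.issues.addLabels({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              labels: ['full-build'],
            })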

Automated documentation builds

Adding the Netlify integration to the repository, plus a netlify.toml file in the root, gives us free deploy previews of our docs:

[build]
publish = "_build/html/"
command = "make html"
base    = "docs/"

[context.deploy-preview.environment]
PYTHON_VERSION = "3.7"

Each build takes about 3 minutes and includes a bunch of automated caching.

Problems

Oracle

Yeah. Oracle. Urgh. The major issue with supporting Oracle is that the Docker image requires authentication to pull, and this poses big issues around security (sensitive secrets are not injected into non-master builds). Even if we could work around this, the startup time for an Oracle container is absolutely ridiculous (like, hours).

If we want to support Oracle builds in the CI then I think we need to create a custom django-oracle image that inherits from the standard upstream Oracle image. The image would need to pre-initialize the database to avoid long startup times and be hosted on a private Docker Hub repository that is only available to the Django CI runners (Oracle do not like you distributing the image publicly). However, for now, we could just keep Jenkins around for the Oracle integration :frowning:.

Permissions

GitHub Actions requires containers to run as root, which breaks three permission-based tests. I’ve currently just added @unittest.skipIf(os.getuid() == 0, ...) to them, but something better needs to be worked out.

MaxMind GeoIP database

MaxMind recently put all their previously public (and free) GeoIP databases behind a service that requires authentication. This is a problem similar to the Oracle one above: it’s going to be hard to work out how to include these in the Docker image in an automated way.

Final thoughts

I really think this idea has legs, and the PoC was surprisingly easy to get mostly working. If feedback here is favourable then I can spend some more time working on this and an associated DEP.


This looks amazing, gotta find some time to look through that.

Given that Oracle has been a PITA in the past, I think it is more than okay to simply skip it for now and keep it on Jenkins (which we can then scale down massively). Let’s get the 90% done and then worry about the rest, otherwise we will never get this merged.


Thank you! I agree that we should hit the 90% mark first and see if we can improve it going forward.

One other approach that I forgot to mention in my initial post is just adopting Actions without docker-box. The Docker approach definitely adds complexity, and it would be simpler to just execute the relevant pip/apt/whatever install commands and then run runtests.py. However, consolidating all these setup steps in a single place (the Dockerfile) gives some nice benefits, and the setup is non-trivial (lots of packages to install, etc.).

Another benefit that I didn’t really highlight is macOS testing; while I have no numbers to back this up, I’d expect a non-trivial number of our users to be developing on that platform. We also get to test on a newer version of Windows - I think Jenkins runs on Windows 7, right?

I like using the docker containers because that gives local reproducibility.

:+1:

Can we build the branches on a schedule? I.e. a weekly full build to catch errors before releases?

I’m totally onboard with integrating django-docker-box, moving stuff to GitHub Actions, etc. Great suggestions @orf !

It’s possible to run containers directly with GitHub Actions in its services config, e.g. on django-mysql. Does this idea have any legs, perhaps making tests faster? I do recognize it would mean duplicating stuff from the docker-compose.yml.
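
For reference, a service container in a workflow looks roughly like this (a generic sketch, not django-mysql’s actual config; the settings module in the last step is hypothetical):

jobs:
  postgres:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:13
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        # wait until the database is ready before the steps run
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v2
      # dependency installation omitted for brevity
      - run: python tests/runtests.py --settings=test_postgres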

We’re now using pre-commit for linting (PR), and if we activate the pre-commit.ci service (perhaps when it leaves alpha), it can push auto-fixes for isort etc.

Our docs are on readthedocs, and they provide a PR build which needs a readthedocs file: https://github.com/adamchainz/django-mysql/blob/master/.readthedocs.yml . So we don’t need netlify.

Sure - Actions is very flexible and we can define workflows that run on a schedule. We could run a daily full master build at midnight, which has the advantage of keeping the Docker cache warm.
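
A scheduled trigger is just a cron entry on the workflow, something like:

on:
  schedule:
    # hypothetical: a full build of master every night at midnight UTC
    - cron: '0 0 * * *'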

I think it would actually be slower because of the dependencies that need to be installed. The Dockerfile is here and it takes about 3-4 minutes to build. By using Docker as a cache we can reduce this overhead to basically zero in the happy path (hot cache, no dependency changes). Docker on Linux has very little (effectively no) overhead, so I would be very surprised if the tests themselves ran faster outside of it.

However it’s definitely worth benchmarking - I’ll experiment with this the next time I work on it.

This would be great - automating these kinds of manual, annoying things is pretty important. Black can definitely help here, but at least being able to go fix these linting errors and then merge in a single step would be cool.

:partying_face: awesome! I wasn’t aware that they provide this, and it makes things simpler. Netlify has a limit to the number of build minutes you can use as well.

Yes, duplication is my main worry. Btw, how cool would it be if GitHub could just take a docker-compose file and transform that into services? :smiley:

Where does RTD host those builds? Do you have an example in some repo for that? Either way, it doesn’t have to be Netlify; either is fine I guess.

Right, it’s “just YAML” :sweat_smile:

On RTD. Hit “view details” on this recent django-mysql PR and find the build; it links to https://django-mysql--746.org.readthedocs.build/en/746/ . The PR was just from this morning - I don’t know when the temporary build times out.


You may laugh now, but if you compare About service containers - GitHub Docs with the example in Docker Compose overview | Docker Docs, I’d say that for the MVP they’ve already got a matching syntax :wink:

It’s still there and I guess that would be good enough. This is lovely!

Sure, my laughing is more at YAML itself :slight_smile:

Thanks for all the feedback here!

In the interest of moving this onto something more concrete and away from a large PoC, I’m proposing we take the following course of action:

  1. We merge a GitHub Actions MVP with the smallest possible set of changes and monitor the general experience for a while
  2. If the feeling is positive we begin adding docker-box to the project to run the basic sqlite tests across all supported Python versions, and continue to monitor
  3. If all goes well we expand the test suite to match what is currently running in Jenkins
  4. We then add support for a full-matrix build, with an associated PR command/label, and work out which versions we want to test on every build and which ones we want to reserve for full builds.

I’d love some feedback from the fellows on this course, as well as any indication from anyone about how we structure this. Would this require a DEP? If so I’d argue that a quick MVP that can be easily backed out of would be good to test first, to provide some more justification for the DEP and to avoid too much “upfront design”.

In this vein I’ve created an initial PR to add actions for macOS, Windows and flake8/isort. These require no real changes to the project, are nicely isolated, and give us some immediate benefits like macOS testing.

Regarding point 4, I feel that we should:

  1. Run sqlite tests under every Python version
  2. Pick an appropriate Python version and run every database type/version against that

The reasoning is that database-specific failures that don’t depend on the Python version are more likely to arise than failures specific to both a database and a Python version.
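
As a sketch, that quick matrix could be written as an explicit include list (the versions here are only illustrative):

strategy:
  matrix:
    include:
      # 1. sqlite on every supported Python
      - { app: sqlite, python: 3.7 }
      - { app: sqlite, python: 3.8 }
      - { app: sqlite, python: 3.9 }
      # 2. every database against a single Python version
      - { app: postgres, python: 3.8 }
      - { app: mysql, python: 3.8 }
      - { app: mariadb, python: 3.8 }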

Sounds good to me. Given that Django uses pre-commit via https://github.com/django/django/blob/master/.pre-commit-config.yaml, would it be possible to use the pre-commit/action GitHub Action to run `pre-commit` instead of manually configuring flake8 & isort? (To be fair, I am not sure what pre-commit can do, I haven’t used it yet.)
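
From a quick look at its README, the usage seems to be roughly this (untested here, and the version pin is just whatever is current):

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: pre-commit/action@v2.0.0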

This certainly sounds like the lowest-friction plan, and even if we were to decide against it I think offloading some things to GitHub might not be the worst thing to do.

paging @carltongibson and @felixxm to the action :wink:

Imo no; I think I can speak for the ops team when I say: the less infra we need to maintain in the long run, the better. And if a vendor lock-in argument comes up I’d argue that executing “git rm -r .github” is easy enough (we do not have to delete our Jenkins configs after all).

Yes this is a good thought.

That action can work, but I think it would also be nice to try pre-commit.ci, the SaaS also created by Anthony Sottile. It says it’s alpha but it’s working for me on my 27 open source projects.

I agree - there are enough alternative CIs taking the same-ish approach that migration should never be too hard.

I’m somewhat sceptical - wouldn’t something else pushing to your branch, potentially after every commit, get really annoying? I also foresee potential issues with CI resources - we could get into a situation where new PRs are effectively run twice in quick succession. I’d much rather have the ability to see the linting errors, if any, and an automated way to fix them up when needed, e.g. @pre-commit fixup as a comment.

However maybe it’s worth a go? I’d still like to keep the flake8 and isort actions we currently have though?

:ok_hand: perfect! Full steam ahead. I’ve added you both as reviewers to the PR.

Oh right, I did not realize that the pre-commit service would push new commits; I thought it would just run in check mode (even if that makes no sense, not sure where I got the idea from :))

I just work here. :smiley:

I’m on the record as thinking hosted CI is just about the best thing since sliced bread. I used to run Jenkins, and didn’t like it, at all.

So generally +1 from me.

The bottom line is that @felixxm ends up maintaining Jenkins, so I defer to him.

I have one :grimacing:

… make it the “blessed” way to spin up an environment for running the Django test suite.

Docker is great, yay Docker. But it’s often unnecessarily complicated and resource intensive. I don’t think it’s easier for most Python devs than the current “make a venv, install the requirements, run runtests.py”. I’d like to keep that as the first entry point.

I grant that once you want multiple DBs, using Docker is a good idea.

Don’t shoot me :smiley:


But that only covers SQLite; if you want to test PostgreSQL as well you are in another world of pain [as you noted] :slight_smile: But yes, I do not care much about what is listed first. I just think that we need Docker if we ever want to cover more than one database without overengineering Jenkins.

I’m definitely not advocating for Docker to become the only way to run the tests and I prefer running plain runtests.py in a lot of situations. I’m advocating for it to become a fully supported way to run tests that need anything other than sqlite, complete with bundled settings files and documentation.

I had a play just now with using tags to control builds. I added a number of Full- tags to my fork and split the jobs into “Quick” and “Full” variants. The quick builds only run on Python 3.7, whereas the full builds run on 3.8 and 3.9. For the full builds I just had to add this to get them to run conditionally:

if: contains(github.event.pull_request.labels.*.name, 'Full-Postgres') || contains(github.event.pull_request.labels.*.name, 'Full-Build')

Which was pretty simple. The end result was this nice build matrix:

I like using tags for this because they persist across pushes, rather than being one-off like our current “test on oracle” comment. You can automate adding tags when a comment is added:

Or we can use something like this action to base it on comments instead of triggers:

Another interesting integration I came across is the official “labeler” action:

You define some labels and some path globs and it will label PRs if those files change. It might be interesting to apply labels to documentation-only changes, or even by Django component?

Hi Tom, I took a step back and thought about the security implications. Not a problem now, but something we should consider in the future, i.e. your docker push requires a secret. Now, if we add that to the Django repo, who has access to it? Can a malicious actor open a PR and exfiltrate the secret via curl (or simply echo $SECRET)?

This should not block us from adding flake8/linting etc; but is something to consider when we expand.

EDIT: Instead of looking at the docs I just clicked through the UI, and GitHub has a few helpful explanations there:


TLDR: We should be good.

If the Docker secret is needed but is “not passed to workflows that are triggered by a pull request from a fork”, does this mean that nearly all pull requests will fail?

See the Codecov issues, which had this problem for a while until an alternative solution was found by GitHub changing their API (and therefore not needing secrets at all).