Improving the contribution experience with Github Actions

We wouldn’t push the containers from pull requests anways, so this would be fine.

But yes, things like test reports uploads etc will all not have access to the API.

Hi @orf, fantastic work!

About the issues with Oracle, maybe we can try unofficial images? I don’t know if we’re allowed to do so, but given that django-docker-box also uses an unofficial image, maybe it’s worth a try.

I personally use this one for my django-jsonfield-backport package, and it takes less than 10 minutes to setup. However, if you want to run the tests with a custom user (not the default one included in the image), you’ll have to mount an SQL script to do that. Thus, it can’t be run as a service container. Here’s how I did it, for example.

I hope it’s useful!

Hi all,

There was some discussion on the JavaScript PR about it failing (randomly). With gha you can re-run the tests, currently I’ve seen folk close and re-open PRS to get Jenkins to re-run the test suite.

I’m not sure how the permissions work, but should some folk in addition to the mergers have the ability. I see this as low risk but could help PRs progress? This should probably be aligned to adding of badges for wider test runs if/when we get there.

I raised concerns about vendor lock-in originally in the mailing list.

But, given the current proposal of integrating docker-box, I don’t see it - at least for Linux. The way I see it is that Github Actions would be more like a backend for it, just like Jenkins, and it would add easy reproducibility of the tests if you’re offline.

So, I’m +1 for this.

I also share @carltongibson concerns about docker requirements and complexity, but as @orf mentioned, it won’t be the only way, and if “Make a venv,…, runtests.py” still is the first entry point, I see no problem.

As for Win, MacOS, for what I understood, docker-box does not support Win or MacOS, so it has to be “direct” GA or Jenkins.

I don’t know if including the Jenkins conf would be enough for a User to reproduce testing on them being offline or withouth the CI. If so, or it is “reasonably good”, this could be a way.

Oracle can follow this way, if no docker-box solution is bundled (as suggested by sage)

Thank you for the image suggestion! I’ll try integrating that later, it sounds like a good idea. 10 minutes of setup time isn’t great but it’s far better than the official images.

Can you clarify? Only specific users have access to retry jobs?

Even without docker box I don’t see any problem with this around lock-in for running the test suite on Actions. It’s basically a glorified “YAML bash script” with the commands clearly laid out. I see some interesting benefits to adopting third party actions to automate some repository “stuff”, and while an argument could be made about lock-in there I don’t think it holds much water: we’re on Github. Leaving improvements to our Github experience (via actions) on the table seems silly to me.

Well, one thing is the ability to easily reproduce the test suite “offline” in order to PR better, as long as that is possible, the rest is “glitter” - much wanted perhaps, but “secondary” to me :slight_smile:

A bit more digging suggests you need write access to re-run jobs which seems a bit of a shame. Here’s what I see when I comparing Django to another repo where I have write access.

That’s the kind of thing GH are quite responsive too… Wondering if we can get that adjusted to the Triage Role… — I will ask.

Awesome work on this @orf! There’s really a lot of possibilities that opens up, looking forward to it. It’ll also be good to expand the test matrix, I’ve found a couple of bugs by running django-docker-box on the latest PostgreSQL versions. And as described earlier django-docker-box currently very easily lag behind changes made to supported versions in the django-repo. A recent example is the drop of support for Python 3.6 and 3.7 which broke django-docker-box. By making it a first class citizen these issues will go away.

Regarding rerunning CI by closing and re-opening, it shouldn’t be a problem since the reopened is one of the default events on pull_request. If the need to rerun CI by people who isn’t the author or doesn’t have write-permissions is a common thing (is it?) it could be solved by something like the previously described commands @buildbot rerun. The action backing these commands could check against a file in the master-branch of the main repo of triage usernames. IIRC ansible does something similar to this to give limited CI permissions to people working on specific modules.

Regarding the limitations of GITHUB_TOKEN on actions triggered from forks I’ve had some success with separating stuff like handling publishing of coverage into separate workflows which triggers on the main repo by the event workflow_run. It works by letting the workflow which is triggered by the PR (and is running with lowered permissions) to upload artifacts of for example a coverage.xml file produced by a test run. The separate workflow which runs on workflow_run type completed will then download the artifact and use secrets from the origin repository to for example post a coverage report as a comment in the PR. See this blogpost for more details. And this repo for an example implementation.

I have also been playing with such jobs. The main issue as I see it is the increased complexity. But I guess if we limit our-self to the most useful things, it might be a workable approach. We just should be careful and slowly add things as opposed to the whole kitchen sink.

This might also mean that we pull those external actions into one of our own repos, to limit the risks of getting code that we don’t want…

In an ideal world we (as in Django team) could configure which access levels GITHUB_TOKEN would have. Ie we’d never want it to commit to the repo at all (for now), disabling write access to the repo for all actions would already help greatly.

So we’re slowly adding Github actions, but currently around the periphery (linting, JS tests). I’ve got the rare mix of motivation and time in the coming weeks so I’d really like to try and push this forward so it doesn’t languish - @felixxm you said you had some reservations, do those still exist and can we discuss them? I’d like to get the sqlite tests on MacOS and Windows ready for checkin (perhaps without removing the Jenkins builds?) but I’m not sure what is still blocking them. If I can get some areas for improvement or things you’re concerned about I can work on that, even if you’re not in a position to review and merge them right now.

It might also be good to discuss other, potentially easier things to merge using Actions. I’ve got three ideas that could be useful to new contributors and us:

  1. A comment on the merge request for first time contributors, saying “thanks for contributing” and some information trac or a link the contribution guide? Maybe there are some common mistakes that new contributors often make that can be helped with this? I wonder if you have any input here on the things we might include @carltongibson?

  2. An action to make it easier to link trac → github? When a PR is made, if it lacks trac info (i.e a ticket ID in the title) we can prompt the user to modify the title to be in the correct format. Once done we could automatically modify the trac ticket to include a link to the PR and set the “has patch” flag to True? If there is no associated ticket then the change has to be a trivial “quick fix”, and in that case we can tag it as such for other contributors to review quickly. When a MR is closed we can also remove the has_patch attribute from the trac ticket.

  3. A scheduled action to close merge requests that are currently open but where the trac ticket is closed? I imagine there are not that many of them but this could be expanded later with other “cleanup” tasks.

Hey @orf sorry for the slow reply. I was on holiday and it slipped through the net. :stuck_out_tongue_closed_eyes:

I’m keen on all the ideas here. I’d like to have had time over the last couple of years to investigate GitHub bots and stuff. Actions looks to make that easier so generally :+1:

I think the first bootstrap is the blocker. (Easier now we’re up and running). Once there’s an example in place it’s generally easy to copy.

Maybe a Welcome to a first time contribution, with maybe a link to the Patch Review Checklist might be a nice start :thinking:

Thanks for taking the initiative here. These things always need a champion.:trophy:

C.

1 Like

Thanks Carlton, I hope you enjoyed your holiday!

I’ve got an initial PR here: Add a message to new contributors when they open a PR by orf · Pull Request #14090 · django/django · GitHub. I can’t test this on my fork as I’m not a first time contributor, but the action just checks if they have authored a PR before and if not then it adds a comment as the “Github Actions Bot”. If we have a bot user we would like to use instead then we can switch out the token and use that.

Also, maybe slightly off-topic, but shouldn’t we remove the CLA section from the checklist? I’ve never filled it in and it’s not a thing that seems to be enforced. It also might be offputting.

If we wanted to enforce this going forward then there are a few services we could use to handle this - CLA-assistant from SAP is one I’ve used before.

And @ngnpope has a good solution for Maxmind databases (Fix Maxmind GeoIP downloads · Issue #14 · django/django-docker-box · GitHub). I can’t believe I never thought about a test database before.

We could potentially add a submodule for these test files or vendor them directly? They are only ~20kb a pop.

(Sorry for the spam!)

I brought this up a while ago https://groups.google.com/g/django-developers/c/etb1Ye_IJjE/m/umhPQ9IsEAAJ — it seemed to me we either need to do it properly, or not at all.

The answer was roughly, it’s there for folks who want to do the right thing

In the end I settled for moving it to the bottom of the First Steps list.

If a TB member wanted to lobby for change that might make a difference… :grinning: