RFC: Prioritizing Tasks for “Automate Processes within Django Contribution Workflow” (GSoC 2025)

Hi everyone,

As part of GSoC 2025 under the Django Software Foundation, I’m working on the project titled “Automate Processes within Django Contribution Workflow.” The primary goal is to reduce manual overhead for maintainers and improve the contributor experience through automation and better CI integration.

I’m reaching out to get feedback from core contributors, mentors, and fellows on what areas we should prioritize over the next few weeks. There are multiple directions this project could take, and I want to make sure the work aligns well with the community’s expectations and current pain points.


Project Background


Initial Work


Discussion Goal

I’d love your input on:

  • Which automation features should be prioritized first?
  • Are there areas that should not be automated due to the nuance or context they require?

Any feedback, even small notes, would be incredibly helpful for scoping the next steps of the project.
Thanks in advance!

Saurabh

Personally I would prioritize:

  • test coverage report
    • We already run the SQLite tests in GitHub Actions; it would be good to run them with coverage and to produce a report based on the PR’s git diff (perhaps using something like diff-cover · PyPI ) that warns about missed lines (see the workflow sketch after this list). Missed lines should only be a warning because some lines are only run on certain databases and would not be expected to be covered by SQLite
  • release note docs
    • Tickets that are classed as a “New Feature” should have a release note in the upcoming release (e.g. docs/releases/6.0.txt); we would also expect to see other docs changes that include either a .. versionadded:: 6.0 or .. versionchanged:: 6.0 note
    • Tickets that are release blocker bug fixes and not for “dev” should have a release note in the next version (e.g. a release blocker for 5.2 should have a release note in docs/releases/5.2.6.txt)
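
To make the coverage point concrete, below is a minimal sketch of what such a job could look like. The job and step names, Python version, dependency install, and compare branch are placeholders rather than Django’s actual CI configuration:

```yaml
# Hypothetical job sketch; names, versions, and paths are assumptions, not Django's real CI.
sqlite-coverage:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0            # diff-cover needs target-branch history to compute the diff
    - uses: actions/setup-python@v5
      with:
        python-version: '3.12'
    - name: Install test dependencies
      run: python -m pip install -e . coverage diff-cover
    - name: Run SQLite tests under coverage
      working-directory: tests
      run: coverage run ./runtests.py --settings=test_sqlite
    - name: Diff coverage report (warning only)
      working-directory: tests
      run: |
        coverage xml
        # --fail-under=0 keeps missed lines as a warning rather than a failure,
        # since some lines are only exercised on other databases.
        diff-cover coverage.xml --compare-branch=origin/main --fail-under=0
```

diff-cover only needs the XML report plus enough git history to diff against the target branch, which is why the checkout step fetches full history.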

I think the test coverage report saves a reviewer the time of running the tests with coverage and checking the code changes against the report.
I think the release note “hints” may not be difficult for a reviewer to point out to the PR author, but with this check in CI the author would get the feedback that they are missing a release note much more quickly.
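
To illustrate how cheap that CI feedback could be, here is a hedged sketch of a warning-only step. The target branch, the path checks, and the message wording are all assumptions, it assumes the checkout fetched full history so origin/main...HEAD resolves, and it deliberately sidesteps how to read the ticket’s “New Feature” classification from Trac:

```yaml
# Hypothetical warning-only step; branch name and paths are placeholders, and reading the
# ticket's "New Feature" classification from Trac is left out of this sketch.
- name: Release note reminder (warning only)
  run: |
    changed=$(git diff --name-only origin/main...HEAD)
    if echo "$changed" | grep -q '^django/' && ! echo "$changed" | grep -q '^docs/releases/'; then
      echo "::warning::This PR changes django/ but adds nothing under docs/releases/. New features and backported bug fixes normally need a release note."
    fi
```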


I was looking at what other projects (CPython) do in terms of automation. I noticed they have bots for backporting patches: “Miss Islington” does the cherry-pick/backport itself, and “Bedevere” manages the labels on GitHub.

While I’m not sure whether it’s out of scope for this year’s project, or whether the Fellows would even find it useful, I thought it worth mentioning since there’s prior art available that could serve as inspiration.


Thanks @sarahboyce and @smithdc1 for the points.
I’ll start with the diff-cover point.

In a similar vein, in the pylint project we use a backport GitHub Action, and it “just works”, even providing easy-to-follow git commands for resolving merge conflicts manually.
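
For anyone curious what that pattern looks like, here is a rough sketch using tibdex/backport, one commonly used backport action (not necessarily the exact one pylint runs); per its documentation it opens a backport PR whenever a merged PR carries a label naming a target branch:

```yaml
# Hypothetical workflow; the action choice and the "backport <branch>" label convention
# are assumptions based on the action's documentation, not pylint's exact setup.
name: Backport merged pull requests
on:
  pull_request_target:
    types: [closed, labeled]

permissions:
  contents: write
  pull-requests: write

jobs:
  backport:
    if: github.event.pull_request.merged == true
    runs-on: ubuntu-latest
    steps:
      - uses: tibdex/backport@v2
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
```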


I’ve implemented an initial version of running SQLite tests with coverage and generating a diff-based report using diff-cover.
At this stage, the setup adds coverage collection to the existing workflow and outputs warnings for missed lines instead of failing the job.
Could you please review this approach and let me know if I’m moving in the right direction before I refine it further?

PR - extended test.yml to include coverage by 0saurabh0 · Pull Request #6 · 0saurabh0/django · GitHub

Thanks!

Hi Saurabh,

The update sounds like a really valuable initiative. A few thoughts from experience working with contribution-heavy projects:

Prioritization:

  • Automations that reduce repetitive maintainer workload (like checking for missing tests or docs updates) should come first, since they offer immediate ROI and are relatively low-risk.
  • Label management (e.g. need-tests, need-docs, good-first-issue) is a natural early win.

Areas to avoid over-automation:

  • Anything that requires judgment about code quality or architecture (e.g. deciding whether a PR should be accepted or rejected) should stay human-driven.
  • Be cautious about auto-closing PRs too aggressively. A “soft warning” (comment + label) might be better than closing outright, so contributors don’t feel discouraged.

Ideas for later phases:

  • Automated reminders on stale PRs (e.g. a ping after 30 days of inactivity); see the sketch after this list.
  • Automated checks for issue–PR linkage (reminding authors to reference related tickets).
  • Lightweight contributor-experience tooling (like ensuring PRs follow commit message conventions).

Feedback loops:

  • It might be useful to make automation decisions configurable via repo settings or labels, so maintainers can fine-tune behavior over time.
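
On the stale-PR reminders specifically, a soft-warning version can be sketched with the stock actions/stale action; the schedule, timings, label name, and message below are placeholders:

```yaml
# Hypothetical workflow; timings, label, and message are placeholders.
name: Stale PR reminder
on:
  schedule:
    - cron: '0 6 * * *'            # once a day

permissions:
  issues: write
  pull-requests: write

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          days-before-stale: 30
          days-before-close: -1          # never auto-close; comment + label only
          days-before-issue-stale: -1    # leave issues alone
          stale-pr-label: 'inactive'
          stale-pr-message: >
            This PR has had no activity for 30 days. It is not being closed;
            this is just a friendly reminder in case it fell off anyone's radar.
```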

I worked on something similar at Modsen in an internal dev-tools project, and the biggest lesson was: start simple, iterate, and always have a “human override” so maintainers don’t feel locked in by automation.
Hope this is useful!

Late to the party here, but coverage reporting in PRs has been attempted before and the effort stalled due to the difficulty of combining coverage reports from different suite runs.

Some parts of the code are only covered by tests run on Postgres, MySQL, Oracle, or a particular Python version, for example, so there needs to be a coordinated job that collects all of the .coverage data artifacts and then combines them; otherwise the resulting coverage report will be lacking or will improperly report that some areas are not covered (e.g. if we only use the SQLite test run and Postgres-only changes are introduced).

This is especially difficult because some tests run on Jenkins (the vast majority) and others on GitHub Actions.
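
For what it’s worth, on the GitHub side the coordination could look roughly like the sketch below, assuming each test job runs coverage in parallel mode and uploads its data files as an artifact. All job and artifact names are made up, and this says nothing about the suites that only run on Jenkins:

```yaml
# Hypothetical combining job; assumes each test job ran `coverage run -p ...` and uploaded
# its .coverage.* files as an artifact named coverage-data-<job>. Jenkins runs are not covered.
combine-coverage:
  runs-on: ubuntu-latest
  needs: [sqlite-tests, postgres-tests]    # placeholder upstream jobs
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0
    - uses: actions/setup-python@v5
      with:
        python-version: '3.12'
    - run: python -m pip install coverage diff-cover
    - uses: actions/download-artifact@v4
      with:
        pattern: coverage-data-*
        merge-multiple: true
    - name: Combine per-database data and report on the diff
      run: |
        coverage combine
        coverage xml
        diff-cover coverage.xml --compare-branch=origin/main --fail-under=0
```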