As part of GSoC 2025 under the Django Software Foundation, I’m working on the project titled “Automate Processes within Django Contribution Workflow.” The primary goal is to reduce manual overhead for maintainers and improve the contributor experience through automation and better CI integration.
I’m reaching out to get feedback from core contributors, mentors, and fellows on what areas we should prioritize over the next few weeks. There are multiple directions this project could take, and I want to make sure the work aligns well with the community’s expectations and current pain points.
We already run tests in GitHub Actions for SQLite. It would be good to run these with coverage and to generate a report based on the git diff (perhaps using something like diff-cover on PyPI), with a warning message about missed lines. Missed lines should only be a warning, because some lines are only run with certain databases and would not be expected to be covered by SQLite.
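To make that concrete, here is a minimal sketch of what the extra steps could look like, assuming the existing SQLite job can be extended; the runtests.py invocation, settings module, and compare branch are placeholders rather than our actual workflow configuration:

```yaml
# Hypothetical additions to the existing SQLite test job; paths and the
# compare branch are illustrative only.
- name: Run SQLite tests with coverage
  run: |
    pip install coverage diff-cover
    coverage run tests/runtests.py --settings=test_sqlite
    coverage xml   # writes coverage.xml for diff-cover to consume

- name: Report missed lines in the diff (warning only)
  run: |
    # Never fail the job: some lines are only exercised on other databases.
    diff-cover coverage.xml --compare-branch=origin/main || true
```

diff-cover also has a --fail-under option if we ever wanted to enforce a threshold, but for the reasons above I would keep this purely advisory.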
Release note docs:
Tickets that are classed as a “New Feature” should have a release note in the upcoming release (e.g. docs/releases/6.0.txt). We would also expect to see some other docs changes that include either a .. versionadded:: 6.0 or .. versionchanged:: 6.0 annotation.
Tickets that are release blocker bug fixes and not for “dev” should have a release note in the next version (e.g. a release blocker for 5.2 should have a release note in docs/releases/5.2.6.txt).
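To sketch what the release note “hint” could look like as a CI step, here is a rough, hypothetical version that only checks whether anything under docs/releases/ changed; mapping the Trac ticket classification to the exact expected file, and taking the base branch from the pull request event, are left out:

```yaml
# Hypothetical CI step: warn (never fail) when a PR touches no release notes.
# Assumes the checkout fetched enough history (e.g. fetch-depth: 0) and that
# main is the right base branch to compare against.
- name: Warn if a release note appears to be missing
  run: |
    git fetch origin main
    if ! git diff --name-only FETCH_HEAD...HEAD | grep -q '^docs/releases/'; then
      echo "::warning::No file under docs/releases/ was changed. New features and release blocker fixes usually need a release note."
    fi
```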
I think the test coverage report saves a reviewer the time of running the tests with coverage and checking the code changes against the report themselves.
I think the release note “hints” may not be that difficult for a reviewer to point out to the PR author, but with this check in CI the author would learn very quickly that they are missing a release note.
I was looking at what other projects (CPython) do in terms of automation. I noticed they have bots associated with backporting patches: “Miss Islington” does the cherry-pick/backport itself, and “Bedevere” manages the labels on GitHub.
While I’m not sure if it’s out of scope for this year’s project, or if the Fellows would even find it useful, I thought it worth mentioning as there’s prior art available which could serve as inspiration.
In a similar vein, in the pylint project we use a backport GitHub Action, and it “just works”, even providing easy-to-follow git commands for resolving merge conflicts manually.
I’ve implemented an initial version of running SQLite tests with coverage and generating a diff-based report using diff-cover.
At this stage, the setup adds coverage collection to the existing workflow and outputs warnings for missed lines instead of failing the job.
Could you please review this approach and let me know if I’m moving in the right direction before I refine it further?
The update sounds like a really valuable initiative. A few thoughts from experience working with contribution-heavy projects:
Prioritization: Automations that reduce repetitive maintainer workload (like checking for missing tests or docs updates) should come first, since they offer immediate ROI and are relatively low-risk.
Label management (e.g. need-tests, need-docs, good-first-issue) is a natural early win.
Areas to avoid over-automating:
Anything that requires judgment about code quality or architecture — e.g., deciding whether a PR should be accepted or rejected — should stay human-driven.
Be cautious about auto-closing PRs too aggressively. A “soft warning” (comment + label) might be better than closing outright, so contributors don’t feel discouraged.
Ideas for later phases: automated reminders on stale PRs (e.g., ping after 30 days of inactivity; see the sketch after this list).
Automated checks for issue–PR linkage (reminding authors to reference related tickets).
Feedback loops: it might be useful to make automation decisions configurable via repo settings or labels, so maintainers can fine-tune behavior over time.
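For the stale-PR reminder idea above, the stock actions/stale action could probably cover it without custom code; the schedule, thresholds, and wording below are placeholders, and it is deliberately configured to comment and label only, never close:

```yaml
# Placeholder workflow using actions/stale: remind and label, but never close.
name: Ping stale pull requests
on:
  schedule:
    - cron: "0 6 * * *"
permissions:
  pull-requests: write
jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          days-before-pr-stale: 30
          days-before-pr-close: -1       # never auto-close
          stale-pr-label: "stale"
          stale-pr-message: >
            This pull request has been inactive for 30 days. Is it waiting on
            review, or does it need an update from the author?
          days-before-issue-stale: -1    # leave issues alone
```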
I worked on something similar at Modsen in an internal dev-tools project, and the biggest lesson was: start simple, iterate, and always have a “human override” so maintainers don’t feel locked in by automation. Hope it will be useful.
Late to the party here, but coverage reporting on PRs has been attempted before, and the efforts stalled due to the difficulty of combining coverage reports from different suite runs.
Some parts of the code are only covered by tests run on Postgres, MySQL, Oracle, or a particular Python version, for example, so there needs to be a coordinated job that collects all of the .coverage data artifacts and then combines them; otherwise the resulting coverage report will be lacking or will improperly report that some areas are not covered (e.g. if we only use the SQLite test run and Postgres-only changes are introduced).
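For reference, the missing piece would be shaped roughly like the job below: each test job uploads its coverage data file under a distinct name, and a final job downloads and combines them. The job and artifact names are hypothetical and none of this exists in the current CI:

```yaml
# Hypothetical combine job; assumes each test job wrote a unique data file
# (e.g. COVERAGE_FILE=.coverage.sqlite) and uploaded it as an artifact
# matching "coverage-*".
combine-coverage:
  runs-on: ubuntu-latest
  needs: [tests-sqlite, tests-postgres, tests-mysql]   # placeholder job names
  steps:
    - uses: actions/checkout@v4
    - uses: actions/download-artifact@v4
      with:
        pattern: coverage-*
        merge-multiple: true
    - name: Combine data files and build one report
      run: |
        pip install coverage diff-cover
        coverage combine     # merges the .coverage.* files into one
        coverage xml
        diff-cover coverage.xml --compare-branch=origin/main
```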
This is especially difficult because some tests are run on Jenkins (the vast majority) and others on GitHub.
The current PR only posts a comment with the coverage report and does not fail when there are lines missing coverage (as the test run is limited). In the comment, it says:
Note: Missing lines are warnings only. Some lines may not be covered by SQLite tests as they are database-specific.
Beyond databases and their versions, we may have Python-version-specific code, etc. It certainly isn’t a complete report.
We already have a Jenkins CI coverage report (https://djangoci.com/job/django-coverage/HTML_20Coverage_20Report/) which is run daily against main.
I believe this adds value despite being incomplete (I think this is also only run on SQLite). The idea with the GitHub Action is to have this information automatically available on PRs.
That being said, I think having this CI job (perhaps including the Jenkins coverage job) documented in our contributing docs, with its limitations, might be wise. Somewhere within our docs around reviewing PRs (e.g. the “Submitting contributions” page in the Django documentation). Then we have something we can link to with more information if folks are finding the report confusing (or the comment itself can link to it). We may also mention limitations around coverage in general: the report saying lines are “covered” doesn’t always mean they are well tested (this should be checked in review).
In short, for PRs which don’t impact the ORM, I feel this would add value in most cases.
So I am +1 on us having this limited coverage report posted on PRs.
I now realize that what’s proposed here is less ambitious than what was previously attempted, and I agree that it can still be valuable to contributors to have the details inlined in their pull request if the comment comes with an admonition.
Sorry for jumping the gun here. Maybe I’ve just been around for too long, but it wouldn’t be the first time I’ve seen very well-intentioned efforts follow the exact same path towards fixing long-standing problems in ignorance of previous attempts, and I wanted to make sure that wasn’t the case here, as I couldn’t find any references to previous efforts in the linked documents.