CLA vs. DCO for Django contributors

We need input from our Django core contributors and maintainers on how we ensure all contributions are made in accordance with our licensing and copyright requirements!

Current state

We currently use Contributor License Agreements for Django core. Those are legal contracts between the DSF and contributing individuals or organizations, which help us guarantee that:

  • All contributors have the rights to their contributions (copyright), and transfer this right to the DSF. This allows us to change Django’s license in the future if we want to.
  • All contributors are responsible for making sure their contributions are compatible with our license (no patented or copyleft code).

This is good but creates hurdles:

  • We aren’t systematically checking that all code in PRs or patches is from people who have a signed CLA.
  • The CLA process is daunting to individuals, and a legal hurdle for companies (it requires legal review).

What we need

We need to revamp this. There are two options:

  1. Retain a CLA process but with more automation. This would mean systematically checking that all contributors in PRs and patches have signed the CLA, ideally via a GitHub “CLA Assistant” bot running on PRs (and an equivalent for patches).
  2. Switch to a Developer Certificate of Origin (DCO). This is a simpler agreement that doesn’t require legal review, and is much easier for contributors to make and for us to check, as it integrates with git natively.

DCO in practice

Switching to a DCO would mean:

  1. We (DSF) lose the ability to relicense Django in the future. That isn’t part of a DCO.
  2. All commits merged in Django have to be signed off by their contributor (git commit -s, which adds a Signed-off-by line to the commit message).

This is simpler but needs to happen for every commit.
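To make the per-commit check concrete, here is a minimal sketch of what a DCO check on one commit message could look like. It is not an existing Django or GitHub tool; the function names are made up for illustration, and it assumes messages use plain "\n" line endings and that a match on the author's email is enough.

```python
import re

# Line format written by `git commit -s`: "Signed-off-by: Name <email>"
SIGN_OFF = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>]+)>$", re.MULTILINE
)


def sign_off_emails(message: str) -> set[str]:
    """Return the lowercased emails found in Signed-off-by lines."""
    return {m.group("email").lower() for m in SIGN_OFF.finditer(message)}


def is_signed_off(message: str, author_email: str) -> bool:
    """True if the commit author appears among the sign-offs (hypothetical rule)."""
    return author_email.lower() in sign_off_emails(message)
```

A real check would run this over every commit in a PR and would probably be stricter about where the line sits in the message.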


Keen to hear people’s thoughts on this!

Hey @thibaudcolas, thanks for raising this.

A couple of old threads from django-developers left me with the impression that the CLA was (likely) not really needed, but good to have for folks that needed to do things properly (e.g. contributing as a company-employed engineer, say).

I don’t know if that’s correct. There was a certain amount of “The DSF will be able to say”.

CPython has a bot that checks CLAs are signed before allowing a merge. I suspect the hardest bit of doing something similar would be bootstrapping the list of folks who’ve filled one in over the years from the cla@… email. (Jeff has proven “get Claude to do this” chops, so might be worth an enquiry.)

:+1: ty! I can confirm the CLA is indeed overkill, if we agree we’ll never relicense Django. However from the “I’m not a lawyer” advice we have received so far, it does seem we need a clear statement from contributors that their contributions are done in accordance with our license (“inbound=outbound” concept).

CPython has a bot that checks CLAs are signed before allowing a merge. I suspect the hardest bit of doing something similar would be bootstrapping the list of folks […]

Yep, we would need a check like that. If we retain a CLA, it’s probably a GitHub app integrating with Google Drive / Google Sheets. If it’s DCO, it’s based on git commits only so relatively simple. Could be added to an existing check, or a separate app / check. This would also need to be set up for patches uploaded in Trac.
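For the CLA variant, the core of the check is a set difference between commit authors and known signatories. A rough sketch, assuming the signatory list has already been loaded from wherever it ends up living (that part is out of scope here), that matching is by email only, and ignoring corporate CLAs entirely; the function name is hypothetical.

```python
def authors_missing_cla(commit_author_emails, signed_emails):
    """Return commit author emails with no CLA on record, sorted for stable output."""
    # Normalise both sides so case and stray whitespace don't cause false alarms.
    signed = {email.strip().lower() for email in signed_emails}
    authors = {email.strip().lower() for email in commit_author_emails}
    return sorted(authors - signed)
```

An empty result would let the check pass; anything else would be listed in the check output so the contributor knows who still needs to sign.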

For the bootstrapping of a new CLA process, for transparency, the records I’m aware of are only from 2022 onwards. I’m sure there are records from before (some stored elsewhere digitally, some paper). We have on file, roughly:

  • 70 Corporate CLAs (not sure how many individuals within)
  • 70 Individual CLAs

I’m not too worried about that bootstrapping, it is a fair bit of work but there are clear opportunities to set up one-off automation. From my side the main concern is achieving the lowest friction possible for contributors and maintainers.


For the sake of completeness, in addition to CLAs and DCOs there’s another option for us to achieve this “inbound=outbound” agreement with contributors, which is to rely on the GitHub Terms of Service, “6. Contributions Under Repository License”.

My understanding is this is suitable but much weaker. This is also only for GitHub users, so would need to be replicated in another way for contributions via patch uploads in Trac.


Oh and for people who want to learn more about this, I would recommend The Legal Side of Open Source guide from GitHub.

One concern with a DCO is that Django maintainers, and especially the Fellows, very often amend contributor commits before merging. This is not just minor fixes, but can be adding missing tests, re-ordering changes, or rewriting docs to fit our standards. Under a DCO, as far as I understand, any time a maintainer touches a commit like this they become a co-author and must add their own Signed-off-by line alongside the contributor’s.

We generally do not want to create extra commits for these adjustments, so the only option would be to amend and add multiple sign-offs to the same commit. That is extra process overhead, and easy to get wrong, especially when working on backports or series of related changes.
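To illustrate the "multiple sign-offs on the same commit" step, here is a small sketch of appending a maintainer's sign-off to an existing commit message without duplicating it. As far as I know git can do this itself with git commit --amend -s; the helper below is hypothetical and only shows what the resulting message looks like.

```python
import re

# Rough shape of a trailer line such as "Signed-off-by: Name <email>".
TRAILER = re.compile(r"^[A-Za-z0-9-]+: .+$")


def add_sign_off(message: str, name: str, email: str) -> str:
    """Append a Signed-off-by line for name/email unless it is already present."""
    trailer = f"Signed-off-by: {name} <{email}>"
    lines = message.rstrip().splitlines()
    if trailer in lines:
        return message.rstrip() + "\n"
    # Start a new trailer paragraph unless the message already ends with one.
    # A one-line message is always treated as a subject, never as a trailer.
    if lines and (len(lines) == 1 or not TRAILER.match(lines[-1])):
        lines.append("")
    lines.append(trailer)
    return "\n".join(lines) + "\n"
```

Calling it on a commit that already carries the contributor's sign-off adds the maintainer's line directly underneath, and calling it twice leaves the message unchanged.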

Given how often we do this kind of surgery, I am not convinced DCO is a good fit for our workflow. It feels like we would spend time fighting the tooling instead of merging patches. Automating the CLA check like Python does seems like a more practical improvement without changing how we work. This option would certainly be the one that has “the lowest friction possible” for maintainers.

Last point: it has been a long time since we processed patches in Trac. For example, I have never done one myself, so perhaps any workflow change should really focus on what we actually do day-to-day on GitHub.

Is there not an option 3 where we do neither?
I do not really understand what we are trying to achieve or prevent with these processes. Given that we haven’t enforced this for 20 years, I don’t see a strong need for it.

But that might be a lack of understanding

The BSD license permits incorporation into differently-licensed, even proprietary software. So I’m not sure there’s a meaningful barrier to “relicensing” even today. If the DSF for some reason wanted to create a differently-licensed thing, use the name “Django” (which the DSF owns the trademark on) to refer to that thing, and incorporate some or all of prior-Django’s code, my understanding is that would require no additional grant of rights or permissions from anyone.

I shall research more :smile:

Expanding on what CLAs & DCOs are for, for people who find this interesting.

On what the point is: it’s to achieve a level of defense against possible future intellectual property disputes (copyright, patents). Without a CLA / DCO, we have less defense against that. We still have some defense (operating in good faith, GitHub ToS, etc), just arguably weaker.

As far as the usefulness, the closest (flawed) analogy I can come up with is that it’s like paying for years for insurance against something very unlikely. Your insurance coverage might be amazing but never end up getting used. You might have been sold insurance you don’t really need, but it’s tricky to know for sure. Maybe at some point you’ll benefit from the coverage. Maybe you will, but could have gotten better value with a different type of coverage.

There are two clear problem scenarios:

  1. Merging patented proprietary code.
  2. Merging GPL or otherwise BSD-incompatible open source code.

For example that could be an ORM contribution from someone working at a database vendor company. Or maybe an open source contributor copy-pastes a template tag implemented in a GPL package.

On one end of the spectrum we get asked nicely to remove the problem code. On the other end we’re in legal proceedings all the way to an expensive trial. This can still happen with a CLA / DCO in place, but those processes help us respond to those types of requests. In theory. I’m not a lawyer.