Change send_mail() and EmailMessage() to kwargs-only?

I’d like to change send_mail(), the EmailMessage constructor, and some other django.core.mail APIs to require keyword args for their more-obscure parameters. The goal is to simplify future additions/changes/removals of parameters, while allowing params to be listed in logical order in both code and docs. Seeking feedback before opening a ticket.

[This came up in reviewing the PR to add multiple EMAIL_PROVIDERS. In the updated mail APIs, the developer had originally (and reasonably, IMO) placed new provider options alongside existing connection params. But since that would alter documented parameter ordering, it had the potential to break existing code relying on posargs. And while I’ve suggested using kwargs-only for all new params to those functions, it would be helpful to retrofit the existing params now so we don’t run into this again later.]

In practice, I’d expect that the vast majority of code calling send_mail() is already using keywords for anything past the first four args. This proposal would make that mandatory.

Specifically, I’d like to change the current:

def send_mail(
    subject, message, from_email, recipient_list,
    fail_silently=False, auth_user=None, auth_password=None,
    connection=None, html_message=None):

to (after a deprecation period):

def send_mail(
    # these parameters can be positional or keyword:
    subject, message, from_email, recipient_list,
    *,  # all other parameters are keyword-only:
    fail_silently=False, auth_user=None, auth_password=None,
    connection=None, html_message=None):

The first four params would not be affected, and would continue to allow either positional or keyword args even after the change.

Like any change to documented APIs, this would go through deprecation. During the deprecation period, posargs would continue to work for fail_silently and on, but would issue a deprecation warning.

Similarly, send_mass_mail(), mail_admins(), mail_managers(), and get_connection() would all be changed to kwargs-only starting at their fail_silently params. (Depending on some design decisions in the EMAIL_PROVIDERS work, we might want get_connection() to be entirely kwargs-only, but that’s a separate discussion.)

For the EmailMessage and EmailMultiAlternatives constructors, the first four subject, body, from_email and to params would continue to support both positional and keyword args, but the remaining params starting with bcc would become keyword only after deprecation.

1 Like

I tend to have more appetite to carefully break things than most, but I think this is a great idea. I suspect that many of these would have been keyword-only arguments if they had existed at the time these interfaces were designed. Particularly boolean parameters are much clearer when they are called with keywords, so when in doubt I prefer keyword arguments in APIs I design.

Even if some of these would be acceptable positional arguments if they were done fresh (I’m not sure if that’s true), changing the order is highly breaking and significantly more difficult to communicate, so I would not permit adding any additional positional arguments if we make any arguments keyword only.

2 Likes

It seems like Django users agree with you, at least when calling APIs with boolean arguments. In a GitHub code search for calls to send_mail(), all the examples I could find with more than four args switched to keywords at the fail_silently boolean. But only a handful used keywords for the earlier args—probably recipient_list is a bit long to type. (I looked through about five pages of results.) A search for EmailMessage/EmailMultiAlternatives seemed similar: up to four posargs were common, but keywords after that.

FWIW, the only place in Django’s own code/tests that would be affected is the super call in the EmailMultiAlternatives constructor, which passes through the long list of positional args. All other code and tests seem to use keywords.

Along those lines, I’d guess the biggest impact from this change would be on EmailMessage subclasses, particularly in third-party libraries. E.g., django-mail-factory has a similar super constructor call using posargs, so would get deprecation warnings. But even that seems limited. Most of the examples I looked at call super with keyword args (or the generic **kwargs), so wouldn’t be affected. E.g., in django-fancymail.

For what it’s worth, there are comments saying that the APIs for send_mail() and send_mass_mail() are frozen. This was discussed (and the comment violated) in #20817 (adding the html_message parameter twelves years ago in Django 1.7). There have been no argument added since.

A recent ticket from a few days ago: #36132 (Add **kwargs to send_mail() and send_mass_mail() functions) – Django

As for your kwargs-only proposal, I’d be inclined to do it without deprecation. As you analyzed, the chance of someone passing positional arguments tot these functions is low, breakages will give a clear error message, and breakages can easily be fixed in a backward-compatible way.

Yeah, the comments are about to be violated again as part of #35514 (Dictionary based EMAIL_PROVIDERS settings) – Django, with a new provider parameter for every mail function that takes a connection. (Even if this current proposal doesn’t go forward, I’m trying to encourage that at least any new params be added keyword-only.)

That’d certainly be easier. :grinning:

One possible problem is third-party libraries (like the django-mail-factory example) that use entirely posargs. Any projects using those libraries would be unable to upgrade until the libraries were updated. [I have no idea if django-mail-factory is widely used, or even maintained—it just came up in the first page of the code search. But finding a posargs-only example that easily makes me suspect the pattern isn’t totally unusual. They probably just copied code from EmailMultiAlternatives.__init__().]

I see what you mean about EmailMultiAlternatives. Perhaps a deprecation there would be appropriate. I’m not sure how cumbersome it would be. (I know of one case: Fixed #34355 -- Deprecated passing positional arguments to BaseConstr… · django/django@ad18a01 · GitHub). My suggestion was about the functions.

I have a feeling there are many APIs that would be improved a bit with keyword-only arguments. If we start going down this route without some overall strategy it may feel like death by a thousand cuts.

1 Like

I’ve worked up a little decorator that remaps posargs to kwargs and issues a deprecation warning. It works on functions and class methods (including constructors). The actual remapping is pretty straightforward; most of the complexity is constructing the warning message.

I’m experimenting with two versions of it. The first inspects the function signature to figure out the parameters that are changing. It’s less typing, but requires keeping the keyword params in their original order during the deprecation period:

# Option A
@deprecate_posargs(RemovedInDjango70Warning)
def send_mail(
    # Do not change any positional params or the first five keyword-only params
    # (including their order) before Django 7.0. (RemovedInDjango70Warning)
    subject, message, from_email, recipient_list,
    *,
    fail_silently=False, auth_user=None, auth_password=None,
    connection=None, html_message=None,
    # Add any new params below here. (RemovedInDjango70Warning)
):

The second approach requires an explicit list of params that are changing, but isn’t impacted by changes to keyword param order (accidental or intentional) during the deprecation period:

# Option B
@deprecate_posargs(
    RemovedInDjango70Warning,
    start_pos=4,
    moved=["fail_silently", "auth_user", "auth_password", "connection", "html_message"]
)
def send_mail(
    subject, message, from_email, recipient_list,
    *,
    # (new keyword-only params can be added anywhere, at any time)
    fail_silently=False, auth_user=None, auth_password=None,
    connection=None, html_message=None,
):

[A few other tradeoffs: Option A forbids variable *args params; Option B can support them. (Both allow **kwargs params.) Option A will happily allow any newly-added keyword-only params to be passed as posargs during the deprecation, albeit with a warning; Option B will reject those excess posargs. And Option B can enforce that no positional params are added/removed during the deprecation, per @ryanhiebert’s concern.]

Also, in investigating third-party EmailMessage subclasses yesterday, I noticed Djblets is using a @deprecate_non_keyword_only_args decorator from the housekeeping · PyPI package. So maybe just borrow that instead.

I’m all for switching to keyword-only for this API and others.

Here’s another case where we did this: Ticket #35060 deprecated positional arguments for Model.save(). I found two issues in the dual pathway code there: Ticket #35554 and Ticket #35561.

I haven’t reviewed your decorator @medmunds, but it seems quite extensive, so maybe you have avoided the same issues. One concern is the overhead for shuffling positional arguments into keyword ones does slow down function calls. Another concern is that having a “fake” signature during the deprecation period makes it hard to use IDE help.

Given the overhead, potential issues, and low positional argument usage, I’m in favour of just changing the signatures in one version. Since you already found the example in django-mail-factory, maybe you could make a PR there as part of the work?

Great! I’ll wait to see if there’s more feedback, but seems like we’re leaning toward kwargs-only without deprecation.

I mean, sure, since I already know about that one. But I’m not planning to go hunting.

In case the decorator does come back up, thanks for the pointers @adamchainz. I was already handling the second problem, but I’d missed the first one. (The gist has tests below the implementation.)

I wouldn’t be too concerned about the added overhead for the email functions—those are already pretty heavyweight operations. But I can see the concern if that decorator got sprinkled throughout more performance-critical code.

FWIW, PyCharm’s autocompletion and argument annotations don’t seem fazed by the decorator.

Opened #36163 (Change mail APIs to (mostly) keyword-only parameters) – Django

There is definitely code in the wild that will break from these changes, and I’d bet this code lives mostly on undertested codebases and they could end up having silent or mostly silent dropping of emails here.

That said, I’m still +1.

Btw, for the send_mail() case, it would definitely be possible to write a django-upgrade fixer to rewrite calls using positional args to use keyword ones. This previous fixer also fiddled with arguments.

If the PR gets merged, feel free to open an issue there. (I’m likely to forget)

1 Like

Well, there is an option to go through deprecation with these changes. But I suspect undertested codebases also tend to ignore deprecation warnings (if they’ve even enabled them).

I don’t think we can just ignore the deprecation policy… (I’ll leave it there.)

3 Likes

I also believe we should follow the deprecation policy. We’ve already applied this approach in similar cases, such as with the save method, as Adam pointed out.

I understand that there may be under-tested code that could break, and some users may not check deprecation warnings. However, when those ticket reports come in, we can confidently close them knowing that we did everything in our power to prevent breakages. I’d rather not face frustrated reporters who point out that the documented deprecation process wasn’t followed.

2 Likes

I’ve updated the ticket to require deprecation, as I had originally proposed.

I’ve also opened a proper PR with an updated @deprecate_posargs decorator and tests: Refs #36163 -- Implemented deprecate_posargs decorator. by medmunds · Pull Request #19145 · django/django · GitHub.

I can try to address some of the concerns that were raised about deprecation:

[my paraphrasing] When we’ve done this before, it took a while to shake out all the corner cases and bugs.

My hope in introducing a @deprecate_posargs decorator is to have a single, well-tested piece of code that considers all the corner cases. (And that provides a single place to fix future bugs we may discover.)

Adam’s earlier comment already uncovered one issue I’d missed. I’ll also look at the housekeeping package’s similar decorator to see if there are some others. (That’s why the PR is still marked draft.)

When we’ve done this before, the actual function signatures were changed to use generic, variable params during the deprecation period. That can be confusing.

With the decorator approach, we update the functions to their new, non-generic signatures immediately. (And then the decorator remaps args during the deprecation period.)

Both VS Code and PyCharm seem to handle the decorated function as having its “real” signature. (Added benefit: they flag deprecated usage as invalid.)

VS Code

PyCharm

I’ve also added a test case to verify the decorator preserves the original function’s inspect.signature().

True, plus the decorator itself adds the overhead of an additional stack frame.

The overhead can be significant for functions that do almost nothing. As they do more work, the overhead rapidly becomes immaterial. I’m guessing that most of the functions we’d want to switch to keyword-only params have a large number of params and tend to do a lot of work. But we’d need to be careful about using the decorator indiscriminately on performance-critical paths.

For the email functions, the overhead shouldn't be a concern. In some local testing on send_mail(), I found the impact of the decorator was (literally) lost in the noise.

On my development machine, with send_mail() modified to require keyword only args as proposed, I used timeit on:

send_mail("subject", "message", "from@example.com", ["to@example.com"])

Without the decorator, this averaged ~120 µsec. That breaks down as:

  • 0.2 µsec for the send_mail() call itself (tested by replacing its implementation with pass)
  • 4 µsec to instantiate an EmailBackend and call its send_messages() (tested using the dummy backend)
  • ~25 µsec to serialize the message (tested using the locmem backend modified to skip mail.outbox)
  • ~90 µsec or more to open a connection to a server and transmit the serialized message (simulated by using the console backend modified to write to /dev/null; an actual SMTP server connection would take even longer)

(The first two components were quite consistent over several runs; the latter two had a high degree of variability.)

With the decorator, the added overhead varied with the calling args:

  • 0.25 µsec decorator overhead for send_mail("subject", "message", "from@example.com", ["to@example.com"])
  • 0.42 µsec adding a couple of keyword args: send_mail("subject", "message", "from@example.com", ["to@example.com"], fail_silently=True, html_message="html")
  • 0.57 µsec changing all args to keyword: send_mail(subject="subject", message="message", ... html_message="html")

That’s less than half a percent overhead.

What’s more, to measure the decorator’s precise timing, I had to use send_mail() with its implementation replaced by pass. With the real backends, the timing delta between the decorated and non-decorated send_mail() was too variable to be helpful. In some test runs, the decorated version was actually faster. (The I/O and memory costs of shuffling serialized messages around is so variable, the added overhead from the decorator was just lost.)

The absolute numbers here are not meaningful, but the relative ones are. I also ran some (simpler) performance tests through the PR builds, and the relative results across all build environments were consistent with each other and with what I saw locally.

1 Like

Agreed, on all counts. (And some APIs, like force_str(), might also benefit from positional-only args.)

Using a decorator enforces a consistent strategy on how we convert to keyword-only. It might help to have some guidelines on why/when we should convert.

I feel it’s appropriate to change the email functions and classes now, because all of these are true:

  • They already take a large number of parameters
  • Those parameters have accumulated over time, and their ordering is not particularly logical (or helpful for the docs)
  • We’re performing other work that requires adding more parameters
  • And that code is already so heavyweight that the decorator’s added performance overhead won’t be an issue

Perhaps some combination of those is a starting point for some more general guidelines.

3 Likes