update_or_create Behavior

With the release of Django 4.2, update_or_create behavior changed to automatically pass update_fields to the model save method. This change, although more efficient, can have major effects on developers wanting to upgrade since they can no longer rely on save methods that automatically update derived fields to get saved because they wont be listed in update_fields. I propose that update_or_create have the option to bypass the update_fields behavior so that it mimics the old behavior of the method. Addtionally, I think you should be able to pass additional kwargs to the save method, which is especially useful if you have a custom save method that takes additional kwargs. I opened a ticket (#35014 (update_or_create using update_fields opt out) – Django) but was told to come here to see what the support is for such a feature. See use case example below.

2 Likes

Yeah I had to roll Django back because of this. My Django project (and its postgres DB) is an analytics mirror for an application I run at work, we are not the system of record and do not have write access to the ERP. As a result we have a TON of derived fields with business logic inside methods we call in our saves. It would be really nice if calling update_or_create could just run the save and have all fields be updated. If update_fields are passed then just update those. I get the efficiency advantage but it would be nice to turn it off if you need to.

1 Like

since they can no longer rely on save methods that automatically update derived fields to get saved because they wont be listed in update_fields.

Aren’t such derived fields broken by design for any usage of Model.save(update_fields)?

Why can’t the save method of such models be adapted to augment update_fields with the dependant fields instead?

def save(self, **options):
    if update_fields := options.get("update_fields"):
        options["update_fields"] = (
            set(update_fields)
            | self.get_derived_fields(update_fields)
        )
    return super().save(**options)

While I’m not convinced of its necessity I understand the reason why you’d want a use_update_fields flag to disable this new behaviour under some circumstances.

As for the extra_save_kwargs part of your proposed solution it has nothing do to with the problem discussed here. Django cannot support passing application specific flags to methods this way through a public interface.

Another solution that hasn’t been discussed here, and would not require a save override, is to make sure that the derived fields properly implement the pre_save interface which would ensure they get added implicitly to the update_fields set.

Based on our setup this would be extremely helpful for us as well.

1 Like

I am not suggesting application specific kwargs, I just gave that one as an example. What I am suggesting is the ability to pass any number of arbitrary kwargs to your save method that may be needed for your particular design. “extra_save_kwargs” would default to {} and would only be passed if the devloper provides them to update_or_create method…

In my example use case, how would you call update_or_create and skip the expensive operation that runs by default? The skip_expensive kwarg is specific to my use case the ability to pass arbitrary kwargs in general is not.

Yes, I agree there are solutions to mitigate the new behavior but they all involve modifying every single save method/model (our project has 100s) with the specific specific update_fields required, compared to a simple bypass flag in update_or_create.

Exactly, and update_or_create now using update_fields breaks them.

I guess the overall argument I’m making is that the save method has built in optional kwargs, it also allows for being extended and having additional customized kwargs. You may want the save method called in a number of different ways throughout your application. update_or_create gives you 0 flexibility or say in the matter. It will be called exactly the way update_or_create says. I think there should be options in update_or_create for calling save in a way that you see fit. Nothing about my request is specific to my application or use-case, its simply asking for some flexibility in how update_or_create calls save.

Again, hundreds of custom pre_save hook overrides for every single derived field (assuming I find every single one), vs a simple bypass flag.

1 Like

If the way you’ve implemented your derived fields breaks the Model.save(update_fields) contract I’m not convinced it should be the responsibility of the framework to offer ways to denote it’s dealing with a non-confirming implementation.

I think that it should be the responsibility of the framework to offer hooks and documented interface to provide the ability to conform to contracts and the responsibility of the users to use these hooks to conform to these contracts. If the contracts are inadequate then the framework must adapt them but in this case they seem fine to me:

  1. Either adapt your save overrides to properly deal with update_fields. That could likely also be done using a pre_save signal.
  2. Make your derived fields use pre_save or consider adopting the newly released GeneratedField feature which should cover some of your use cases.

The fact your implementation breaks for any Model.save(update_fields) usage in your code base appears to be a much larger problem to me. Are your aware that your save overrides might be subtly broken by the other save(update_fields) call that the framework does and might do more of in the future?

I’m -1 on making these changes and +0 on adding a possible deprecation shim for 5.0 → 5.2 that would keep not passing update_fields when save is overridden (self.model.save is not Model.save) but I’m unsure what the opt-in mechanism should look like to silence the deprecation warning for overridden but conforming implementations of Model.save.

cc @apollo13, @sarahboyce since your worked on landing this feature.

1 Like

Isn’t that alarming to you? If you have that many fields that adapting their model’s save method represents a large a mount of work how can you be certain that there isn’t some code in your large code base (or one of your third party app) that is calling the bogus SomeModel.save(update_fields) which might break your code in subtle ways?

Under certain conditions yes. The signature must remain compatible and the override must be able to deal with the overridden version’s optional kwargs in a way that preserves the semantic of the option.

Your request to allow passing extra_save_kwargs is. While Python allows you to override methods, Django cannot build and maintain the large surface area of its high level abstractions in a way that systematically allows application level specific kwargs to percolate towards the lower ones it relies on. The bar for such percolation mechanism is quite high and IMO this feature request does not meet it as the fact save is called here is an implementation detail of update_or_create.

Think of it this way: what makes update_or_create special over get_or_create, which also calls save, that it warrants the addition of a pass through of save kwargs and not the other?

And what does it have to do with passing or not update_fields to save which the crux of the issue you’re reporting to face here?

I’m not sure I’m following you here. If you need to pass a single bypass flag then it means you have a single update_or_create call, then it means you have a single model affected and thus a single Model.save method to adapt to make sure to augment update_fields with the derived fields.

I have been summoned :woman_genie: I will need some time to think this through properly but I will share some thoughts.

Firstly, thank you for raising the discussion. I also updated some of my overwritten save methods when I did the Django 4.2 upgrade.

Maybe your upgrade path could be as follows[1]:

  • create a model mixin with a custom object manager that has an overwritten update_or_create (roughly implementing the boilerplatish code example QuerySet API reference | Django documentation | Django (djangoproject.com))
  • have every model that has overwritten the save without worrying about update_fields inherit this
  • upgrade to 4.2
  • slowly update the save methods to care about update_fields and remove mixins as a “at your own pace” migration

I am not sold on the idea of a use_update_fields flag. It feels like it explicitly encourages the overwriting of save methods to disregard update_fields, which (while may be common) shouldn’t be encouraged.

If the work was pointless, then I would be in favour of a revert but I believe there are benefits of doing this.

Hope that made some sense, not had coffee yet :coffee:

[1] Not tested this idea. I’m sure someone else can think of something better.

I can only think of a transitional setting which needs to be explicitly set to true to opt in to the new behavior and immediately gives you a deprecation warning if not set.

In general I agree with Simon. Adding behavior this low in the framework does not really help when 3rd party apps call .save() etc. Ie while maybe useful on a low-level we can’t and don’t want apps to expose this up higher in the stack.

Absolutely, one could argue that there is no clear deprecation path but the feature as such is imo worth it.

I guess I should clarify, my code doesn’t “break” when update_fields is passed, it’s that some fields that get updated during saving wont get written to the DB.

Oh, I would want it implemented on get_or_create too. But I will conceed this part of my request, I understand why its impractical to implement such a thing.

No no, I would need to pass it to every update_or_create. But that is still simpler than what would be required otherwise. With that being said I think the discussions above of a global depreciation opt out seems like a better solution.

Hello, this is exactly what we have done. It even passes through my extra_save_kwargs :slight_smile:

Yeah, I prefer the idea of a global opt out with depreciation warning.

1 Like

I have a question while we are on this topic, and I appear to have the experts in the room. Why is update_fields a necessary save parameter to begin with? Why can’t Models just inherently track fields that have been modified and automatically decide when a DB write is necessary?

I also want to point out that calling update_or_create without defaults now causes the save method to be skipped all together (since update_fields = ). I concede that this ordeal has made me re-evaluate my design choices concerning save method overrides and will definitely do things differently going forward, but this is a major change in behavior that will undoubdly effect more people than just me when considering a django upgrade.

Could be done as well, but someone has to do it :smiley: It is also not clear if (assuming model.x == 3) the assignment model.x = 3 should count as change or not? One could argue that if you explicitly set it, then yes you want it to be saved because the database could have changed in the meantime and you want to be sure 3 is written to the db.

Ah magic! :sparkles:

Please correct me if I’m wrong. I think you have a solution - this discussion was around whether Django could have made this easier for you?

If you want an interesting project, django-upgrade is incredibly cool. Having it handle the updating of save methods (fully/partially/adding a warning) would be a nice thing to experiment with.

Wait, these fantastic ideas dont just implement themselves? lol. Yeah some decisions like that would have to be made. erroring on the side of write the value if it was set but not “changed” seems to make sense. But yeah lets deprecitate the update_fields param!

well I was implementing the solution during this conversation, but at this point yes you are correct. But there could still be others out there desiring an opt out! If the group doesn’t think so I guess the case is closed. I think depreciating update_fields all together would be a worthwhile venture.

I will definitely check this out, thanks!