Bug or (un)expected behaviour with auto_now_add

I have a DateTimeField column with auto_now_add that behaves as expected when creating a new record. However, when “updating” that record, the field is set to Null.

Now, I may be updating in a slightly peculiar manner, in that I know the id of the record, and I do not care whether that id exists or not – Django figures out whether it is an insert or an update when I use .save().

from my_project.models import MyModel

mymodel = MyModel()
mymodel.id = '1'
mymodel.save()
#createdate field is set correctly

yourmodel = MyModel()
yourmodel.id = '1'
yourmodel.save()
#createdate is set to Null

#save mymodel again, without any changes
mymodel.save()
#createdate field is set correctly

I checked the connection.queries and the documentation, and what Django seems to do is: if there is a primary key in the object upon .save() being called it attempts an update, if the update fails it will use insert.

The connection.queries reveal that:

When a new model object is created it has a createdate value of None; after object.save() the createdate is updated to the expected value. But in Django's initial UPDATE attempt the createdate is Null in the SQL, and only on the INSERT is it set to an appropriate value.

Where there is no existing record with the primary key, then the insert is what modifies the database. However, when there is a record with the primary key, the update is successful, and Django appears to expect the createdate to have been populated prior to the .save().
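The sequence seen in connection.queries can be reproduced outside Django. What follows is only a minimal sketch using Python's sqlite3 module, not Django's actual implementation: the table name mymodel, the helper save_like_django and the nullable createdate column are assumptions made purely for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def save_like_django(conn, pk, createdate=None):
    """Imitate the UPDATE-then-INSERT sequence described above.

    Hypothetical helper, not Django code: createdate is only filled in
    automatically on the INSERT branch, as with auto_now_add.
    """
    # Step 1: try an UPDATE using whatever value the object carries.
    cur = conn.execute(
        "UPDATE mymodel SET createdate = ? WHERE id = ?", (createdate, pk)
    )
    if cur.rowcount > 0:
        return "update"
    # Step 2: no row matched, so INSERT and stamp the creation time.
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("INSERT INTO mymodel (id, createdate) VALUES (?, ?)", (pk, now))
    return "insert"


def read_createdate(conn, pk):
    """Return the stored createdate for the given primary key."""
    return conn.execute(
        "SELECT createdate FROM mymodel WHERE id = ?", (pk,)
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
# Column left nullable here so the NULL written by the UPDATE is visible.
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, createdate TEXT)")

first = save_like_django(conn, 1)    # no row yet, falls through to INSERT
after_first = read_createdate(conn, 1)
second = save_like_django(conn, 1)   # fresh object, createdate is None
after_second = read_createdate(conn, 1)
```

The second call succeeds as an UPDATE, so the INSERT branch that would have stamped the time is never reached, and the stored createdate is overwritten with NULL.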

Given the special nature of the field, a creation timestamp, I wouldn’t expect it to be touched on an update, and I would expect Django to understand and respect that.

The alternative is that I manually add the createdate to the database, and let MySQL manage the timestamp - but I think I am then going to have trouble retrieving that in Django.

I don’t really want to have to check for the existence of the primary key first, so, is there a way around this, and what’s the verdict on whether this is a bug, or a feature?

From what I’m reading in the docs and the source code, I would say that this is the expected and appropriate behavior for how you’re doing this.

Django is expecting that the object you are saving (updating) is complete - and that if an update is being performed, all the fields are populated.

You have the update_fields parameter available on the save when you know you’re only going to update certain fields - otherwise Django doesn’t know what fields should be saved.

Having the field specified as auto_now_add does not create a “special field” - it’s still a regular DateTimeField. Managing it is all performed within Django, not within the database. That field can still be changed in your code after model creation. (Model forms aren’t going to create it as an editable field, but that’s a different topic.)

There’s a method named pre_save on DateField (and DateTimeField) in django/db/models/fields/__init__.py that takes a parameter named add, which indicates whether this is an insert or an update. That parameter is False on an update, which prevents the time from being automatically set on this field.
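To make the role of add concrete, here is a simplified sketch of that logic in plain Python. The classes AutoNowAddField and Record are stand-ins invented for this example, not the Django classes, and the real method delegates to its parent class where this sketch simply reads the attribute.

```python
from datetime import datetime, timezone


class AutoNowAddField:
    """Stand-in for a date field; not the Django class."""

    def __init__(self, attname, auto_now=False, auto_now_add=False):
        self.attname = attname
        self.auto_now = auto_now
        self.auto_now_add = auto_now_add

    def pre_save(self, model_instance, add):
        # Stamp the time on every save (auto_now) or only on insert
        # (auto_now_add together with add=True).
        if self.auto_now or (self.auto_now_add and add):
            value = datetime.now(timezone.utc)
            setattr(model_instance, self.attname, value)
            return value
        # Otherwise pass through whatever the instance already holds.
        return getattr(model_instance, self.attname)


class Record:
    """Hypothetical model instance whose createdate starts out unset."""

    createdate = None


field = AutoNowAddField("createdate", auto_now_add=True)

fresh = Record()
on_update = field.pre_save(fresh, add=False)   # value stays None
on_insert = field.pre_save(fresh, add=True)    # value is stamped
```

With add=False the field hands back the None that the fresh object carries, which is the value that ends up in the UPDATE statement.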

As for letting MySQL manage the timestamp, I don’t know why you think that would be a problem. It may make migrations difficult - you might need to add a raw SQL statement in a migration to define the behavior for that field, but Django’s not going to have any problem with it. (Or, if migrations aren’t an issue, define the field directly in the database and add the field to the model. Whatever the situation, it can be worked around.)

I can see at least four options:

  • Create the attribute in the database layer as described above.

  • Do a get on that primary key to “pre-populate” all the model fields and catch the exception if that row doesn’t exist.

  • Check for the existence of a row with that pk.

  • Re-evaluate the logical structure of your database and the architecture of your application to determine whether or not this situation really is appropriate.

Well, I suppose the purpose from my point of view is that I don’t care whether the pk exists in the database or not, and thus I don’t care whether it is an insert or update. It follows that checking for the pk before .save() is an unnecessary overhead… in most cases.

But given that Django first tries an update even when an insert is what is actually needed, I may as well do the test myself and force the insert or update as necessary.

As to whether a DateTimeField with auto_now_add is a special field type or not, it is at least a special parameter for that field type, and that parameter essentially says that the field value should never be None. So, given that it is None on some occasions, then that is a bug, IMO.

No, that parameter does not say that. The null and blank parameters perform that function.

See the first note at auto_now_add - using that parameter implies blank=True, which means that Django does expect the possibility that the field may (validly) be blank.

Well, you made me look at it again, and, after looking at it again, I am certain there is a bug.

Here’s the field definition:

createdate = models.DateTimeField(auto_now_add=True, db_index=True)

You’ll note that there is no null or blank parameter on the field definition. Hmm…

So, I have checked the documentation, and it says:

Field.null
If True, Django will store empty values as NULL in the database. Default is False.

Since we have no null parameter in the field definition, then the default applies: null=False.

This implies that there is a bug, because Django allows a .save() where it is known that createdate has no value set.

So, potentially, as you note, the null parameter could be True, and that would explain the behaviour and the Null value in the database, but in this instance that explanation does not apply.

I am taking it that the notes applied to DateField also apply to DateTimeField, in which case there is a special application of blank when using auto_now_add, but there is no special provision for null.

You didn’t quite dig deep enough here.

Yes, the generic field class has that setting. However, there’s a related attribute empty_strings_allowed which can override that value.

In the class definition for Field:

# Designates whether empty strings fundamentally are allowed at the
# database level.

What this means is that either a database engine or a field definition can specify that empty strings are not permitted. In this case, a DateTimeField does not permit an empty string, and so it is stored as null.

And, the first line in the class DateTimeField:
empty_strings_allowed = False

So no, there is no bug here.

So… if I put in the parameter null=False in the field definition, it gets overridden by empty_strings_allowed!?

Only if you’ve also specified blank=True.

The purpose of that value is to handle those cases where it wouldn’t make sense to allow a zero-length string in a field - for example, an integer field.

Now, some databases might store integers as a character string, so allowing a zero-length string for an integer to represent a null value may make sense.

Other databases might store integers as binary data, in which case it makes no sense to try and store a zero-length string in that field - and this is precisely the type of situation where that attribute is used.

So yes - if you specify blank=True, and either it doesn’t make sense for the field type to allow a zero-length string or the database engine sets interprets_empty_strings_as_nulls, then that class attribute overrides null=False.

I looked at using the update_fields parameter with .save(). While this can prevent the update of the createdate field in my model, it has the unfortunate side effect of raising an error whenever .save() would need to execute an insert rather than an update.

So, if you are going to use it, you first need to know whether you are doing an insert or an update, which in my case defeats the purpose of using update_fields.

After some fussing around I decided to, more or less, do what Django does for a .save():

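from django.db import IntegrityError

try:
    my_instance.save(force_insert=True)
except IntegrityError:
    # Row already exists: update only the listed fields.
    my_instance.save(force_update=True, update_fields=my_instance.default_update_fields())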

Where .default_update_fields() is a method in my models (via a mixin, that does a few other things too).
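The same insert-first, update-on-conflict idea can be tried without Django. This is a rough sketch with sqlite3 under stated assumptions: the mymodel table, its name column and the insert_or_update helper are all made up for the example, and the UPDATE deliberately leaves id and createdate out, which is roughly what I use default_update_fields() for.

```python
import sqlite3
from datetime import datetime, timezone


def insert_or_update(conn, pk, name):
    """Try INSERT first; on a duplicate key, UPDATE the other columns only."""
    try:
        now = datetime.now(timezone.utc).isoformat()
        conn.execute(
            "INSERT INTO mymodel (id, name, createdate) VALUES (?, ?, ?)",
            (pk, name, now),
        )
        return "insert"
    except sqlite3.IntegrityError:
        # Row exists: leave id and createdate alone, update the rest.
        conn.execute("UPDATE mymodel SET name = ? WHERE id = ?", (name, pk))
        return "update"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel ("
    "id INTEGER PRIMARY KEY, name TEXT, createdate TEXT NOT NULL)"
)

first = insert_or_update(conn, 1, "a")
created = conn.execute(
    "SELECT createdate FROM mymodel WHERE id = 1"
).fetchone()[0]

second = insert_or_update(conn, 1, "b")
name, created_after = conn.execute(
    "SELECT name, createdate FROM mymodel WHERE id = 1"
).fetchone()
```

Catching only the integrity error, rather than every exception, means an unrelated failure such as a lost connection is not silently retried as an update.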

So, when researching/testing this I found that the pk/id cannot be in the update_fields list. Some further research turned up a new method that has just been added to the Django source code: _meta._non_pk_concrete_field_names.

_non_pk_concrete_field_names

The bit I don’t understand about this, and I thought I had finally grasped this point, is why, at line 802, is _non_pk_concrete_field_names not appended with ()?

ie:

field_names = self._meta._non_pk_concrete_field_names()

Thanks

That function is defined with the @cached_property decorator, so it is read as an attribute rather than called. See the functools page (cached_property) and the Built-in Functions page (property) in the Python 3.12.0 documentation.
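A small illustration of why the parentheses are absent. This sketch uses functools.cached_property from the standard library; as far as I know Django ships its own cached_property in django.utils.functional, which is used the same way. The Options class and its attribute names are hypothetical stand-ins, not Django's _meta.

```python
from functools import cached_property


class Options:
    """Hypothetical stand-in for a model's _meta object."""

    def __init__(self, field_names, pk_name):
        self.field_names = field_names
        self.pk_name = pk_name
        self.computed = 0  # counts how often the method body actually runs

    @cached_property
    def non_pk_field_names(self):
        self.computed += 1
        return frozenset(n for n in self.field_names if n != self.pk_name)


meta = Options(["id", "name", "createdate"], pk_name="id")

names = meta.non_pk_field_names        # attribute access, no ()
names_again = meta.non_pk_field_names  # served from the cache

try:
    meta.non_pk_field_names()          # this calls the frozenset itself
    call_failed = False
except TypeError:
    call_failed = True
```

Adding () would try to call the value the property returns, here a frozenset, which is why it fails with a TypeError.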