Can we relax the restriction (E042) on including generated fields in composite PKs?

Hey folks,

Trawling through the code, ticket & PRs for the composite PK patch, I couldn’t see any discussion on the restriction for including generated fields in composite PKs.

Can we please relax this restriction for databases that support it?

Haven’t checked other DBs but PG is fine with it as long as the column is STORED.

A valid use case for doing this is auto-generating a “valid time” which is something that may want to be included in PG18’s new temporal PK.

Eg using PG18:

temporal=# create table shift (
temporal(#    cleaner varchar not null,
temporal(#    start_at timestamptz not null,
temporal(#    end_at timestamptz not null,
temporal(#    valid_time tstzrange generated always as (tstzrange(start_at, end_at)) stored,
temporal(#    primary key (cleaner, valid_time without overlaps)
temporal(# );
CREATE TABLE
temporal=# create table shift_2 (
temporal(#   cleaner varchar not null,
temporal(#   start_at timestamptz not null,
temporal(#   end_at timestamptz not null,
temporal(#   valid_time tstzrange generated always as (tstzrange(start_at, end_at)) virtual,
temporal(#   primary key (cleaner, valid_time without overlaps)
temporal(# );
ERROR:  primary keys on virtual generated columns are not supported

I think we can probably introduce a new database feature flag for this.

It might just be a case of virtual vs stored, here’s mysql, I don’t have an oracle handy right now:

mysql> create table foo (foo integer not null, bar integer generated always as (foo + 1) stored, primary key (foo, bar));
Query OK, 0 rows affected (0.009 sec)

mysql> create table foo2 (foo integer not null, bar integer generated always as (foo + 1) virtual, primary key (foo, bar));
ERROR 3106 (HY000): 'Defining a virtual generated column as primary key' is not supported for generated columns.

though my 2¢: I don’t think Django should be policing this at all, aren’t we usually of the policy that Django doesn’t check for every little DB quirk & feature and we delegate it up to the DB to police it?

I’d recommend we just remove the check altogether… if someone tries to define a composite PK with a virtual column they’ll find out soon enough :person_shrugging:

Completely removing the check makes me a bit nervous without getting some opinions from the main author @csirmazbendeguz and the GSoC mentors (thank you @Lily-Foote for your quick reply, @charettes what do you think?).

@nessita very valid concern :+1:, but just fyi the proposal is to keep check in place because there are other useful things it checks; we’d just be removing the generated field part of it.

This brings it inline with declaring a generated field with primary_key=True - which has exactly the same restrictions for all dbs (ie the restrictions aren’t about whether it’s composite or not just whether it’s used in the pk) but there is no check in place for this and you’d get an error from the database if you attempted to migrate with it.

Also I probably should’ve linked the ticket I created so as not to waste your time: #36466 (Relax check E042 preventing including stored generated fields in composite primary keys) – Django

Simon’s already commented on the topic :+1:

1 Like

Thanks for the clarification! I will follow up in the ticket/PR then.

The way I see it is that there’s no such check on the usage of GeneratedField(primary_key=True) so I don’t understand why there’s one on CompositePrimaryKey that reference generated fields.

This is particularly strange given that AutoField itself is backed by a GENERATED column on both Postgres and Oracle and there we some discussions about adding tested support for using AutoField in CompositePrimaryKey not too long ago.

I tried to expand the discussions on the PR to see how the check came to be added and I couldn’t find anything so I suspect it was added because GeneratedField were a new feature and we wanted to protect against the feature complexity at the intersection of both.

1 Like

I asked the same thing and Sarah kindly pointed me at this PR comment which explains why the check was added. I’m a little concern about users getting RecursionError: maximum recursion depth exceeded so as long as we confirm this is not the error they are getting, I think removing the part of the check as proposed is fine.

2 Likes

Thanks for sharing that, looks like the recursion is still an issue and I’ve just asked to hold off merging the PR.

Interesting, if you declare GeneratedField(primary_key=True) there are a few exceptions that get raised that Django doesn’t seem to handle, like create() and refresh_from_db().

I’m investigating these issues, the refresh_from_db() exception at least seems to be fixed by Simon’s branch for ticket #27222 :thinking:

1 Like

Thanks for sharing @nessita and pointing it out Sarah!

So it looks like the check was added to gate against a bad interaction with field deferral (apparently generated fields are deferred even on unsaved, read _state.adding, instances instead of being None) and pk access.

It feels like GeneratedField should be assigned None as attributes value on unsaved instances (e.g. that’s what we do with AutoField for example) and then deferred on persistence (or assigned their RETURNING value on backends that support it).

It looks like the work on ticket-27222 effectively helps a bit with that.

1 Like

fwiw I created a separate ticket for GeneratedField(primary_key=True): #36472 (GeneratedField(primary_key=True) crashes on create(), and other issues) – Django :+1:

1 Like