Handling models with foreign key to possibly non-existing users

I am working on an enterprise LMS powered by Django REST framework.

Authentication is done via Google OAuth 2.0 using the package drf-social-oauth2, and the target organizations for my software are schools and universities.

Some models have foreign keys to the user model. However, due to the nature of my application, a user often wants to reference another user who isn’t in the database yet. Users are created in the database upon their first login via OAuth, but someone may want to create a model instance referencing another person before that person has logged in for the first time. For example, a teacher may want to pre-enroll a list of students into a new course, but those students might not have logged in yet and therefore might not exist in the database.

I’ll give a concrete example with a model in my application:

class UserCoursePrivilege(models.Model):
    """
    Represents the administrative permissions a user has over a course.
    See logic.privileges.py for the available permissions.
    """

    user = models.ForeignKey(
        User,
        on_delete=models.CASCADE,
        related_name="privileged_courses",
    )
    course = models.ForeignKey(
        Course,
        on_delete=models.CASCADE,
        related_name="privileged_users",
    )
    allow_privileges = models.JSONField(default=list, blank=True)
    deny_privileges = models.JSONField(default=list, blank=True)

This object is created in the frontend by accessing a table which shows all registered users and allows turning on switches that correspond to the specific permissions for that user.

More than once I have found myself in a situation where a teacher would email me saying they couldn’t find a colleague to grant them permissions for a course, and I would tell them to have the colleague log in first and then come back to find them in the user table.

However, this isn’t very user-friendly, and it is somewhat counterintuitive considering that my application doesn’t provide an explicit user creation process, so the mental model for users is that their account somehow “already exists” and they just need to sign in.

I’m looking for a way to handle this in as transparent a way as possible.
The target user experience is something like this: if the user cannot find the person they want to create the object for, the interface shows them a banner like “Can’t find the person you’re looking for?” and allows them to type in the email address of that person and proceed like normal (in the example above, that would entail showing them all the toggles to select permissions to grant).

Then, an instance of the correct model would be created, but with a null foreign key to user.
Then I would have a model that looks like this:

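from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models

class PendingModelInstance(models.Model):
    """
    Represents a model instance which references a user that doesn't exist yet.
    """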

    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.TextField()
    content_object = GenericForeignKey("content_type", "object_id")

    email_address = models.TextField()

An instance of this would be created, referencing the “partial” instance with the missing FK and storing the email address of the user.

Then, upon user creation, a query is made to retrieve all instances of PendingModelInstance which have the email of the newly created user. Their referenced models are then retrieved and updated with a FK to the new user instance.
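
The reconciliation step described above can be sketched without Django as follows. This is only an illustration under assumptions: `Privilege`, `Pending`, and `attach_pending` are hypothetical stand-ins for the partial model row, PendingModelInstance, and whatever hook runs on user creation. The case-insensitive email match is also an assumption, not something the design above specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Privilege:
    """Hypothetical stand-in for a model row whose user FK is temporarily null."""
    course: str
    user_id: Optional[int] = None


@dataclass
class Pending:
    """Hypothetical stand-in for PendingModelInstance: an email plus the row it points to."""
    email_address: str
    target: Privilege


def attach_pending(pending: List[Pending], email: str, user_id: int) -> List[Pending]:
    """Fill in the FK on every row waiting for `email`; return the rows still pending."""
    still_pending = []
    for item in pending:
        if item.email_address.lower() == email.lower():
            item.target.user_id = user_id  # the "partial" row becomes complete
        else:
            still_pending.append(item)
    return still_pending


ada_row = Privilege("algebra")
bob_row = Privilege("algebra")
queue = [Pending("ada@example.edu", ada_row), Pending("bob@example.edu", bob_row)]

# Ada logs in for the first time and receives user id 7.
queue = attach_pending(queue, "Ada@example.edu", user_id=7)
```

After the call, Ada's row carries a user id while Bob's row keeps its null FK and stays in the queue.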

This approach seems like it could work fine, but it introduces an issue I don’t really like: it makes foreign keys nullable, which they don’t need to be and shouldn’t be.

Can you think of a better alternative? Have you ever faced this kind of situation?

Maybe you’re missing a Student model. Instead of pointing the FK at a User that may not exist yet, you can create a Student that has a nullable reference to User. When the student logs in, a User is created and assigned to it. You can then define the permissions directly on the Student model, so the student won’t need to log in before permissions can be set.
If you do it this way, you might need to set up a custom authentication backend to authorize these Students; I’m not familiar enough with drf-social-oauth2 to give you advice on how to do this.
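
A framework-free sketch of this idea, under stated assumptions: `get_or_create_student` and `on_first_login` are hypothetical helper names, `"manage_exams"` is a made-up privilege, and students are keyed by email only for illustration. In Django, `user_id` would correspond to a nullable relation from Student to User.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Student:
    email: str
    user_id: Optional[int] = None  # null until the person first logs in
    privileges: List[str] = field(default_factory=list)


students: Dict[str, Student] = {}


def get_or_create_student(email: str) -> Student:
    """Return the Student for this email, creating a user-less one if needed."""
    return students.setdefault(email, Student(email=email))


def on_first_login(email: str, new_user_id: int) -> Student:
    """Attach the freshly created user to the (possibly pre-existing) Student."""
    student = get_or_create_student(email)
    student.user_id = new_user_id
    return student


# A teacher grants a privilege before the colleague has ever signed in.
get_or_create_student("carol@example.edu").privileges.append("manage_exams")

# Later the colleague logs in via OAuth and a user row (id 42 here) is created.
carol = on_first_login("carol@example.edu", new_user_id=42)
```

The privilege granted before login is still attached to the same Student once the user is linked, and nothing that references Student ever needs a null FK.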

This idea is good from the standpoint of (almost) not touching existing models and keeping existing FKs non-nullable.

However, I am not sure about the permanent added level of indirection. I fear it might cause too many extra queries each time you have to access a user and first have to pass through their Profile. (I’m calling it Profile here instead of Student because these issues arise not only from associating models with students but also with not-yet-registered teachers; that’s easily generalized by having a model that applies to all users and not just students.)

That’s a misplaced fear. It doesn’t need to add any additional queries. It may add an additional join to your query, but that’s not necessarily a bad thing. Relational databases exist to optimize that type of data reference. (You will also want to consider whether that relationship between “Profile” and “User” should be a ForeignKey or OneToOne relationship.)
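
To illustrate the point about joins, here is a small sketch that uses Python's built-in sqlite3 module rather than the Django ORM. The table and column names are hypothetical. A single SELECT with two JOINs walks from a privilege row through the profile to the user, so the extra level of indirection costs a join, not a second round trip. (In Django, `select_related()` is the usual way to have the ORM follow foreign keys within the same query.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE profile (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        user_id INTEGER REFERENCES app_user(id)  -- nullable until first login
    );
    CREATE TABLE course_privilege (
        id INTEGER PRIMARY KEY,
        profile_id INTEGER NOT NULL REFERENCES profile(id),
        course TEXT NOT NULL
    );
    INSERT INTO app_user VALUES (1, 'ada@example.edu');
    INSERT INTO profile VALUES (1, 'ada@example.edu', 1);
    INSERT INTO profile VALUES (2, 'bob@example.edu', NULL);
    INSERT INTO course_privilege VALUES (1, 1, 'algebra');
    INSERT INTO course_privilege VALUES (2, 2, 'algebra');
    """
)

# One query: privilege -> profile -> (optional) user.
rows = conn.execute(
    """
    SELECT p.email, u.id
    FROM course_privilege AS cp
    JOIN profile AS p ON p.id = cp.profile_id
    LEFT JOIN app_user AS u ON u.id = p.user_id
    WHERE cp.course = ?
    ORDER BY p.email
    """,
    ("algebra",),
).fetchall()
```

The LEFT JOIN keeps profiles whose user does not exist yet, so pre-registered people still show up in the result with a NULL user id.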

You’re correct—that actually wasn’t the real issue with this solution.

I examined this possible solution a little deeper, and what I realized is that it does have some pros, but it’d require several changes in different places of my application, which is pretty big:

  • all affected models would need to change the model that their fk points to
  • all places where those models are used would require re-wiring to the new relation (e.g. in permission checking, to retrieve the object that represents user permissions, I would now need to query for the profile, requiring changes in ORM methods usage)
  • a profile would need to be created for all existing users, and all existing models that reference users would need to be migrated

The advantage would be that most of the new logic would be transparent with regard to the fact that a profile’s user might not exist yet in the database.

I have thought of another possible solution, which is similar to the one I originally proposed but doesn’t require the FK to be made nullable.

I could have a model like the initially mentioned PendingModelInstance, which looks like this:

class PendingModelInstance(models.Model):
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    fields = models.JSONField()
    email_address = models.TextField()

fields would hold a JSON representation of the fields of the to-be-created model. Then, when a new user is created, something like this would be run to create all pending instances for that user:

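from django.db import transaction

pending_instances = PendingModelInstance.objects.filter(email_address=new_user.email)
with transaction.atomic():  # create everything or nothing
    for instance in pending_instances:
        cls = instance.content_type.model_class()
        cls.objects.create(**instance.fields, user=new_user)
    pending_instances.delete()  # avoid replaying these rows later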

The only issue I can see arises from storing raw model data in a JSON field, but I guess that shouldn’t be too big a problem due to the fact that: (1) validation still happens when actually using that JSON object to create the model instance, and (2) users wouldn’t be able to create these PendingModelInstances arbitrarily through the API, as their creation would be tightly regulated by business logic rules which, in some sense, ensure the JSON field is only filled with valid data.

What do you think?

Actually, I’d go the other way.

If you’re matching people by email address, I’d perform the registration process by creating the User object for currently-unknown individuals and setting the is_active flag to False.

Then, when a person tries to log in for the first time, I’d check that flag. If the user already exists and is_active is False, I’d process that registration as a new registrant.
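
A rough sketch of that flow, independent of Django and of drf-social-oauth2 (whose hooks are not shown here). `User`, `pre_register`, and `handle_oauth_login` are hypothetical names used for illustration; only the is_active flag comes from the suggestion above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class User:
    email: str
    is_active: bool = True


users: Dict[str, User] = {}


def pre_register(email: str) -> User:
    """Create an inactive placeholder so other rows can reference it right away."""
    return users.setdefault(email, User(email=email, is_active=False))


def handle_oauth_login(email: str) -> Tuple[User, bool]:
    """Return (user, is_new_registrant) for a successful OAuth sign-in."""
    user = users.get(email)
    if user is None:
        user = users[email] = User(email=email)  # never seen before
        return user, True
    if not user.is_active:
        user.is_active = True  # placeholder: treat as a new registrant
        return user, True
    return user, False  # ordinary returning user


placeholder = pre_register("dave@example.edu")
first = handle_oauth_login("dave@example.edu")
second = handle_oauth_login("dave@example.edu")
```

Because the placeholder is a real user row, models that point at it can keep a non-nullable FK; the first login only flips the flag and runs whatever onboarding a new registrant would normally get.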
