Importing fixtures with pk=0 causes issues

Hey,

EDIT: TL;DR

If you’re importing data into Django, having an instance with pk=0 will create a host of issues. So don’t, and make sure your data starts at pk=1. This can bite you if you’re using fixtures and want to create placeholder objects, for instance.

I think I’ve found a bit of undocumented behavior in BaseInlineFormSet (or rather inlineformset_factory, which uses it).

So I have imported legacy data, some of which has missing fk relationships. I’ve therefore written a script that creates fixtures to import that data into Django, and assigns fk_XYZ=0 to every foreign key in my data that references a foreign object which doesn’t exist. Then, as a convention, for every table that is referenced by such foreign keys, I’ve created a “dummy” instance with pk=0 and imported it. That way the import into Django works with db_constraints, and all the orphans in my source data now have that dummy instance as their foreign reference.
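
For illustration, here is a minimal sketch of what such a fixture-generation step could look like. It assumes Django's JSON fixture layout (a list of dicts with "model", "pk" and "fields"). The app label "app", the field names fk_produit and commentaire, and the function name build_fixture are all hypothetical stand-ins, not my actual script.

```python
PLACEHOLDER_PK = 0  # the convention used in this post


def build_fixture(parents, children, fk_field="fk_produit"):
    """Return a Django-style fixture list in which every orphaned child
    row points at a single placeholder parent row."""
    parent_pks = {p["pk"] for p in parents}

    # Placeholder parent that absorbs all orphaned references.
    fixture = [{
        "model": "app.produit",  # hypothetical app label
        "pk": PLACEHOLDER_PK,
        "fields": {"commentaire": "Placeholder"},  # hypothetical field
    }]

    # Real parents are copied through unchanged.
    for parent in parents:
        fixture.append({"model": "app.produit", "pk": parent["pk"],
                        "fields": dict(parent["fields"])})

    # Children whose fk is missing or unknown are re-pointed at the placeholder.
    for child in children:
        fields = dict(child["fields"])
        if fields.get(fk_field) not in parent_pks:
            fields[fk_field] = PLACEHOLDER_PK
        fixture.append({"model": "app.fourprod", "pk": child["pk"],
                        "fields": fields})
    return fixture
```

The resulting list can be dumped with json.dump and loaded with loaddata as usual.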

However, I’ve noticed something strange with my inlineformset_factory. In dev, before I imported all the data (including the corrected rows, that is), it worked with extra=1 and just showed one empty form with this code:

forms.py
from django.forms import inlineformset_factory

FourProdFormSet = inlineformset_factory(Produit, Fourprod,
                                        form=FourprodForm,
                                        extra=1,
                                        can_delete=False)

views.py - CreateView:
def get_context_data(self, **kwargs):
    ctx = super().get_context_data(**kwargs)

    if self.request.POST:
        pass  # irrelevant here
    else:
        logger.info("ProduitCreateView.get_context_data - GET: adding formsets to context")
        ctx["form"] = ProduitForm()
        ctx["formset_fourprod"] = FourProdFormSet()
    return ctx

However, now it returns a bunch of inline forms. I’ve checked the details of the inline forms returned (and also stepped through it in the debugger) and they correspond to the orphans to which I have assigned fk=0…

I’m guessing I hadn’t found it before simply because I hadn’t assigned fk=0 to those before.

If anyone’s interested, here’s the relevant code (from the Django source) where it happens, with comments added by me:

class BaseInlineFormSet(BaseModelFormSet):
    def __init__(self, data=None, files=None, instance=None,
                 save_as_new=False, prefix=None, queryset=None, **kwargs):
        if instance is None:
            # At this point instance is None as expected, since it's a new form and I
            # didn't pass it any. self.fk.remote_field.model() returns a Produit obj
            # (the parent) and it has pk=0, with None for all attributes. This is NOT
            # my own Produit with pk=0 from the db: I wrote "Placeholder" in a comment
            # field there, which is absent here. Seems to me Django created that
            # object with pk=0 and None for all fields.
            self.instance = self.fk.remote_field.model()
        else:
            self.instance = instance
        self.save_as_new = save_as_new
        if queryset is None:
            queryset = self.model._default_manager
        if self.instance.pk is not None:
            # This is where things go wrong: here we run a query with the child
            # model's manager, fetching all the child instances whose fk points
            # at the parent, i.e. fk=0.
            qs = queryset.filter(**{self.fk.name: self.instance})
        else:
           ..... code....

Am I missing something obvious? To be clear (unless I’m being dumb), I don’t think this is a bug per se; however, it does seem to me this should be documented. I haven’t been able to find anything about it.
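
To make the branch easier to see, here is a toy model of it in plain Python. This is not Django code: FakeParent and initial_children are made-up names, and the behavior of the else branch (elided in the excerpt) is only assumed to yield no existing rows, which matches the single empty form I got before the import. The point is that pk=0 passes the "is not None" test just like any real pk would.

```python
class FakeParent:
    """Stand-in for an unsaved parent instance (hypothetical, not a Django class)."""

    def __init__(self, pk=None):
        self.pk = pk


def initial_children(children, instance, fk_field="fk_produit"):
    """Mimic the branch: look up existing children only when the parent has a pk."""
    if instance.pk is not None:
        # pk=0 lands here too, because 0 is not None.
        return [c for c in children if c[fk_field] == instance.pk]
    # Assumption: an unsaved parent without a pk gets no existing child rows.
    return []


rows = [
    {"pk": 1, "fk_produit": 0},  # orphan pointing at the placeholder
    {"pk": 2, "fk_produit": 0},  # another orphan
    {"pk": 3, "fk_produit": 5},  # child of a real parent
]

no_pk = initial_children(rows, FakeParent())        # parent with pk=None
zero_pk = initial_children(rows, FakeParent(pk=0))  # parent that starts with pk=0
```

With a parent whose pk is None the lookup is skipped, while a parent that starts out with pk=0 pulls in every orphan row.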

So I guess my workaround would be to assign a different pk to my placeholders; however, that can get tricky as some tables already start at 1… I can’t use a very high value either, because I’m using an explicit primary key with AutoField, whose behavior is to look for the highest existing pk in a table and increment new objects from there.

I guess I could use a negative pk for orphans, since AutoField does allow signed integers; however… I’m a little worried this could have unintended side effects somewhere, just because by convention negatives aren’t really used. Potentially, if the db backend is ever changed, some databases may or may not handle that differently (depending on what AutoField maps to, I guess).

After some more testing: indeed. I just created a new Produit with pk=-1, then updated every fk_produit in the child rows that was set to 0 so that it points at that new dummy pk=-1, and re-imported the fixtures.

Then the inline forms in my CreateView are as expected, i.e. a single extra form, as I specified in its definition…
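
The remapping itself can be done on the fixture before re-importing. A rough sketch, again assuming the JSON fixture layout and using hypothetical names (move_placeholder, the "app." model labels, fk_produit as the fk field):

```python
def move_placeholder(fixture, parent_model, child_model, fk_field,
                     old_pk=0, new_pk=-1):
    """Return a copy of the fixture with the placeholder parent moved from
    old_pk to new_pk and every child reference updated to match."""
    moved = []
    for row in fixture:
        # Copy the row so the input fixture is left untouched.
        row = {**row, "fields": dict(row["fields"])}
        if row["model"] == parent_model and row["pk"] == old_pk:
            row["pk"] = new_pk
        elif row["model"] == child_model and row["fields"].get(fk_field) == old_pk:
            row["fields"][fk_field] = new_pk
        moved.append(row)
    return moved
```

Called as move_placeholder(fixture, "app.produit", "app.fourprod", "fk_produit"), it leaves rows that reference real parents alone.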

Okay, turns out that creating a dummy object with pk=0 is NOT a good idea at all in Django. It causes another issue that cannot be solved, at least if you’re using AutoField(primary_key=True).

The issue is that when trying to save a new object, Django ALSO creates this pk=0, None, None… object. If you already have an instance in your db with pk=0, then you’re out of luck, because the Model class from base.py (from which your own model classes inherit) also uses that intermediary instance. Thus it will fail the unique=True check (for primary keys).

Okay, turns out I’m just an idiot.

I had initially tested using a default value on that AutoField (default= being one of the available kwargs for that field as well). I ended up removing it everywhere, except on the one class that was causing the issue. Only then did I import my fixtures, with my placeholders as pk=0.

Hence why Django, upon saving, insisted on creating an object with pk=0 as the default value: I had told it to do so.

So importing fixtures with pk=0 is probably fine, even when using AutoField, as long as you’re not using default=0 as well, I guess.