AutoField unique constraint violation when using get_or_create on a 'not-default' database...?

I’m baffled by this error. I have some code that essentially does the following:

try:
    with transaction.atomic():
        name = "Control diet"
        category = "animal_treatment"
        protocol_rec, protocol_created = Protocol.objects.using('validation').get_or_create(name=name, category=category)
except IntegrityError as e:
    print(f"Existing records: {', '.join([str(model_to_dict(r)) for r in Protocol.objects.using("validation").all()])}")
    traceback.print_exc()
    print(f"{type(e).__name__} in the {db} database on data row {index + 1}, creating {category} record for protocol '{name}' with description '{description}': {e}")

This is just a snippet with some hard-coded values. This is actually in a loop (above the try block).

And I’m seeing this:

Existing records: {'id': 1, 'name': 'Default', 'description': "For protocol's full text, please consult Michael Neinast.", 'category': 'msrun_protocol'}

Traceback (most recent call last):
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py", line 581, in get_or_create
    return self.get(**kwargs), False
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py", line 435, in get
    raise self.model.DoesNotExist(
DataRepo.models.protocol.Protocol.DoesNotExist: Protocol matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "DataRepo_protocol_pkey"
DETAIL:  Key (id)=(1) already exists.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/DataRepo/utils/protocols_loader.py", line 89, in load_database
    protocol_rec, protocol_created = Protocol.objects.using(
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py", line 588, in get_or_create
    return self.create(**params), True
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py", line 453, in create
    obj.save(force_insert=True, using=self.db)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/base.py", line 726, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/base.py", line 763, in save_base
    updated = self._save_table(
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/base.py", line 868, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/base.py", line 906, in _do_insert
    return manager._insert(
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py", line 1270, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1416, in execute_sql
    cursor.execute(sql, params)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/rleach/PROJECT-LOCAL/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

django.db.utils.IntegrityError: duplicate key value violates unique constraint "DataRepo_protocol_pkey"
DETAIL:  Key (id)=(1) already exists.
IntegrityError in the validation database on data row 1, creating animal_treatment record for protocol 'Control diet' with description 'Mice on control AA purified diet for 3 weeks': duplicate key value violates unique constraint "DataRepo_protocol_pkey"
DETAIL:  Key (id)=(1) already exists.

The load file only has 2 protocols it’s loading (the above one being the first) and the second one loads fine. The database is pre-loaded with 1 “msrun_protocol” (the one in the print) before this script runs. So it seems like the get_or_create correctly does not find the protocol record, but tries to create it with the AutoField value of 1, but 1 already exists in a record! This code was previously working. Changes in the current effort never even touched this script and I’m running out of leads as to the cause of the problem.

Review the docs at QuerySet API reference | Django documentation | Django

You have:

This is going to look for an existing row where both fields match the given variables. If the query doesn’t find a row matching both, it will try to create a new instance with that given data.

Yes, I’m aware of that, but as you can see in my debug print, there is only 1 pre-existing record and it doesn’t match any field. The name field is the only field other than the AutoField that is unique.

class Protocol(models.Model):

    MSRUN_PROTOCOL = "msrun_protocol"
    ANIMAL_TREATMENT = "animal_treatment"
    CATEGORY_CHOICES = [
        (MSRUN_PROTOCOL, "LC-MS Run Protocol"),
        (ANIMAL_TREATMENT, "Animal Treatment"),
    ]

    id = models.AutoField(primary_key=True)
    name = models.CharField(
        max_length=256,
        unique=True,
        help_text="Unique name of the protocol.",
    )
    description = models.TextField(
        blank=True,
        help_text="Full text of the protocol's methods.",
    )
    category = models.CharField(
        max_length=256,
        choices=CATEGORY_CHOICES,
        help_text="Classification of the protocol, "
        "e.g. an animal treatment or MSRun procedure.",
    )

I actually have code to report discrepancies from expected exceptions where a unique field does match and another field (e.g. category) doesn’t, but in this case, none of the fields match and it SHOULD create a record, but for whatever reason, it’s creating an AutoField id that already exists.

Thanks, seeing the model clears this up.

Yes, if you’re adding a row in Django with an id field defined, the sequence that PostgreSQL uses to increment the ID isn’t going to be used on that insert. If you’re adding that original row through Django, don’t specify the ID field. Or, use the PostgreSQL alter statement to reset the sequence to begin with 2.

1 Like

I’m a bit confused. Am I not specifying the id field value. I thought it would auto-increment and automatically start at the first available… We’ve never specified the AutoField values in any of our other code. Is this an edge-case having to do with auto-ID generation on a non-default database?

Oh… one thing you said just hit me… The existing record is a ‘copy’ from the default database! That must be it!

This code happens before the code I pasted above…

for rec in Protocol.objects.using(settings.DEFAULT_DB).all():
    rec.save(using=settings.VALIDATION_DB)

Is there a better way to copy that avoids this issue?

Do you have other tables with foreign keys to Protocol?

If not, set rec.id = None before the rec.save(...)

If so, then you’re going to need to use the ALTER statement to adjust the sequence object.

Well, I changed that copy loop to this, which seems to have fixed the problem:

for rec in Protocol.objects.using(settings.DEFAULT_DB).values():
    # We must delete AutoField key/value pairs because it screws up the next AutoField generation
    del rec['id']
    Protocol.objects.using(settings.VALIDATION_DB).create(**rec)

Yep, essentially the same thing.

1 Like

Thanks for the help! I did not encounter the warning about (implicitly) setting the autofield value in the docs sections I read. And even if I had run across it, I don’t know if I’d have put 2 & 2 together to find the cause of this error.