Primary-key collisions in multi-master replication environment using django

moctrodv · April 13, 2021, 4:26pm

0

I have a multi-master replication setup using Postgres and Bucardo. On top of it, I have a Django application.

The setup does not fully works since I keep getting primary-key collisions. Since the replication is configured to clone the whole database, and thus I’m somewhat confident that Postgres’ sequence ids are being updated, I wonder if it is not Django that keeps somewhere in memory the last sequential id number. If so, does anyone have any hint on what could be done to fix this issue?

Thanks in advance

CodenameTim · April 13, 2021, 5:42pm

Does your setup work with the SERIAL postgres data type? That’s what django uses under the hood for the default auto-incrementing primary keys.

moctrodv · April 13, 2021, 6:05pm

Yes. I’m using serial id as primary key.

I’m aware that this kind of issue could possible arise.

It was recommended to me, by the Bucardo/Postgres folks to change this to possible to a non-sequential alternative. But, before that, I’m just surveying the Django landscape to see and understand the process in witch Django controls the issuing of ids to make sure that some work around could not be implemented without changing the database schema.

For instance, maybe to check if the id is available, if not, skipping ahead. I’m sure that are time penalties involved, but, as said, I’m surveying the possibilities, maybe doing some tests.

Any suggestion would be appreciated…

KenWhitesell · April 13, 2021, 9:31pm

This really isn’t a Django issue - Django does not assign PKs, and does not control them at all - it’s left entirely up to PostgreSQL.
You’ve got a couple of different options -
1 - Use UUIDs instead of serial integers (I’m not a fan of this, but that’s a discussion for another time)

2 - Set the starting value for the sequences in each database instance to some sufficiently large value to allow for space for each. (For example, if you can make the effective decision that you will never have more than 1,000,000 entries in a particular table, you can set the starting values for each instance at 1, 1,000,001, 2,000,001, 3,000,001 … for each instance in your cluster.

3 - Set the auto increment value for each instance to interleave the assigned keys. For example, if you know that you’re never going to have more than 5 masters, to be safe, you could assign an autoincrement value for the sequence to 10, and start each instance at 1, 2, 3, 4, and 5. (And if you end up adding 5 more masters, you have space for them.)

4 - (Slowest by far, but the only way to ensure strictly sequential PK values) Implement an n-phase commit where you lock the table on all the instances to ensure you have exclusive access to that sequence across all masters. (Note: I am not seriously recommending this, I’ve just included it because it is an option - no matter how “bad” an option it may be. If you have a requirement for strictly sequential PK values, I would submit that you’ve got other database design issues needing to be addressed.)

Topic		Replies	Views
Django Unit Tests Raising Duplicate Key Constraint Errors on object Creation Using the ORM	3	3202	April 20, 2022
Django is using primary keys ints and foreign keys bigint. How do I solve this? Using the ORM	4	1174	April 1, 2023
SerialFields - Yay or Nay? ORM	23	537	October 16, 2024
How to use sequences Using the ORM	1	4368	June 28, 2022
new library: django-spicy-id: Automatic "stripe-style" primary key row IDs Show & Tell	2	509	December 23, 2022

Primary-key collisions in multi-master replication environment using django

Related Topics