Primary-key collisions in multi-master replication environment using django


I have a multi-master replication setup using Postgres and Bucardo. On top of it, I have a Django application.

The setup does not fully works since I keep getting primary-key collisions. Since the replication is configured to clone the whole database, and thus I’m somewhat confident that Postgres’ sequence ids are being updated, I wonder if it is not Django that keeps somewhere in memory the last sequential id number. If so, does anyone have any hint on what could be done to fix this issue?

Thanks in advance

Does your setup work with the SERIAL postgres data type? That’s what django uses under the hood for the default auto-incrementing primary keys.

Yes. I’m using serial id as primary key.

I’m aware that this kind of issue could possible arise.

It was recommended to me, by the Bucardo/Postgres folks to change this to possible to a non-sequential alternative. But, before that, I’m just surveying the Django landscape to see and understand the process in witch Django controls the issuing of ids to make sure that some work around could not be implemented without changing the database schema.

For instance, maybe to check if the id is available, if not, skipping ahead. I’m sure that are time penalties involved, but, as said, I’m surveying the possibilities, maybe doing some tests.

Any suggestion would be appreciated…

This really isn’t a Django issue - Django does not assign PKs, and does not control them at all - it’s left entirely up to PostgreSQL.
You’ve got a couple of different options -
1 - Use UUIDs instead of serial integers (I’m not a fan of this, but that’s a discussion for another time)

2 - Set the starting value for the sequences in each database instance to some sufficiently large value to allow for space for each. (For example, if you can make the effective decision that you will never have more than 1,000,000 entries in a particular table, you can set the starting values for each instance at 1, 1,000,001, 2,000,001, 3,000,001 … for each instance in your cluster.

3 - Set the auto increment value for each instance to interleave the assigned keys. For example, if you know that you’re never going to have more than 5 masters, to be safe, you could assign an autoincrement value for the sequence to 10, and start each instance at 1, 2, 3, 4, and 5. (And if you end up adding 5 more masters, you have space for them.)

4 - (Slowest by far, but the only way to ensure strictly sequential PK values) Implement an n-phase commit where you lock the table on all the instances to ensure you have exclusive access to that sequence across all masters. (Note: I am not seriously recommending this, I’ve just included it because it is an option - no matter how “bad” an option it may be. If you have a requirement for strictly sequential PK values, I would submit that you’ve got other database design issues needing to be addressed.)