We’re working with a couple of very large PostgreSQL tables, and we’re running into issues when creating or dropping indexes. This happens both when using Model.Meta.indexes = [...] or when defining fields like:
field_a = models.CharField(db_index=True)
Ideally, I’d like Django to support something along the lines of:
DATABASES = {
"default": {
...,
"OPTIONS": {
"large_tables": ["app.table", ..]
}
}
}
(The key name large_tables is just a placeholder — open to alternatives. The core idea is what matters.)
For any table listed here, makemigrations should generate AddIndexConcurrently() instead of AddIndex(), and likewise use concurrent drops instead of blocking ones.
This should also extend to db_index=True indexes, since those are currently more hidden but still have the same impact on large tables.
Thoughts?
1 Like
Thanks for starting this thread @kingbuzzman . Requiring AddIndexConcurrently is an issue I’ve also bumped into on several projects.
It’s non-trivial to make the autodetector framework “just use AddIndexConcurrently" for a couple of reasons:
- Migration files are database-backend agnostic. This agnosticism allows third-party packages to have one set of migration files that work with all backends. Generating a PostgreSQL-specific
AddIndexConcurrentlyoperations would need to be opt-in, at least.
AddIndexConcurrently needs its migration to have with atomic = False. As such, it is best to separate it from other steps, keeping those in atomic migrations, and the autodetector would need to be adjusted to generate those separate migrations.
These aren’t insurmountable, for sure, but they do complicate the issue.
I believe issues can occur with adding indexes to smaller tables, even, if they are high traffic. I think there’s good argument for always using AddIndexConcurrently if your site has hit a certain scale.
I believe several larger companies enforce using AddIndexConcurrently through a custom linting rule. That might be more feasilbe for your project, for the time being.
It might also be possible to write a tool that patches up migrations post generation to swap in AddIndexConcurrently, through use of the ast module to rewrite the source.
So the solution you see is:
a. Stop using db_index and use Model.Meta.indexes
b. Create a linting rule that enforces the use of AddIndexConcurrently
It is a solution…
Yeah, basically that.
Field.db_index is “soft deprecated”, with the docs saying:
Where possible, use the Meta.indexes option instead. In nearly all cases, indexes provides more functionality than db_index. db_index may be deprecated in the future.
However, there has been some debate around whether this will actually go ahead: (Soft-)deprecating `Field.unique` . Creating a ForeignKey with db_index=False is sometimes necessary.
Another thing that I know at least one project does is linting the SQL of migration files, via sqlmigrate, to check for ADD INDEX statements without CONCURRENTLY, among other things. That gives you a stronger guarantee, as it checks what Django is doing under the hood.
This tracks. Makes sense. Thanks!