Icontains lookup when using nondeterministic collations in postgres fails

On Postgres 16
If you create a collation:

CreateCollation(
  "case_insensitive",
  provider="icu",
  locale="und-u-ks-level2",
  deterministic=False,
),

And then a field with that collation

models.CharField(
  max_length=255,
  db_collation='case_insensitive',
)

And then try to do an icontains lookup (for example using the admin search), it will return this database error:

nondeterministic collations are not supported for LIKE

Which is surprising considering you expect it to use ILIKE instead of LIKE. Digging into the code pointed me to this bug created 15 years ago

https://code.djangoproject.com/ticket/3575

In which it was decided it was better to use UPPER('field') LIKE UPPER('%value%') instead of ILIKE for performance reasons, which I’m not sure are true anymore. Update: a quick and unscientific test I did on a million rows shows that ILIKE is 20% slower.

Since citext is deprecated in django and the “best” way to do case insensitive queries in postgres are collations, I’m not sure if the decision to not use ILIKE should be re-evaluated. Would it make sense for the documentation to warn of this?

In my case I just want the admin to do a case insensitive search on multiple fields, one of which has a nondeterministic collation, but I don’t see any easy way to do that. The simplest way I see is a custom field that when icontains is used, changes it use a custom different lookup that uses ILIKE

1 Like

Hi,

To be clear, while I wrote most of the code around handling collations, I don’t actually know all that much about them so take the following with a hefty pinch of salt.

If your concern here is performance and need to use LIKE queries, I wouldn’t suggest you use a non-deterministic collation. Rather, you should probably use a regular CharField with an index on UPPER() and always remember to use icontains. The reason for this is that the varchar_pattern_ops operator class isn’t compatible with case-insensitive collations, so you can never create an index for this field that works with LIKE queries.

With that said, I couldn’t find this error message in Django (I tried some variations and partials). Are you sure it’s coming from Django and not the database? Can you maybe give the full traceback?

If your concern here is performance and need to use LIKE queries, I wouldn’t suggest you use a non-deterministic collation. Rather, you should probably use a regular CharField with an index on UPPER() and always remember to use icontains. The reason for this is that the varchar_pattern_ops operator class isn’t compatible with case-insensitive collations, so you can never create an index for this field that works with LIKE queries.

I also had the issue of adding a `db_index=True, but that’s being tracked here #34898 (Adding non-deterministic collations to unique CharFields crashes on PostgreSQL.) – Django

You can actually create an index using `Meta.indexes

With that said, I couldn’t find this error message in Django (I tried some variations and partials). Are you sure it’s coming from Django and not the database? Can you maybe give the full traceback?

My bad, I should have clarified, it’s a database error.
For example a simple query like
ExampleModel.objects.filter(case_insensitive_field__icontains='foo')
will throw this error:
NotSupportedError: nondeterministic collations are not supported for LIKE
Which is because icontains does this query:
SELECT "table"."id" FROM "table" WHERE UPPER("table"."case_insensitive_field"::text) LIKE UPPER(%foo%)

This step of my blog post on the CI field deprecation may help: How to migrate from Django’s PostgreSQL CI Fields to use a case-insensitive collation - Adam Johnson . You can avoid using ILIKE.

It’s a little hard to cover it correctly since this behaviour is collation-dependent. The PostgreSQL documentation already covers this detail, I think, and Django’s ORM docs can’t repeat every detail of the underlying database.

3 Likes

You can, yes, but not while using the varchar_pattern_ops operator class, which means the index can’t be used for any kind of partial match query such as LIKE/ILIKE.

Which as far as I can tell, Postgres desn’t support anyway:

From PostgreSQL: Documentation: 16: 9.7. Pattern Matching

The pattern matching operators of all three kinds do not support nondeterministic collations. If required, apply a different collation to the expression to work around this limitation.

As mentioned here, and as Adam says, you can get around this by using Collate() in the query. However, your index won’t work for this either because it uses a different collation to your query. Now… can you craft an index that uses Collate() and varchar_pattern_ops, that will work? I don’t think so, I think Postgres would still tell you that the opclass isn’t supported, but might be worth a try.

So I think my initial suggestion stands: if you need to do partial matching on a case-insensitive field, it’s probably easier to have the field be a regular CharField, using an index with Upper() and making sure you always use icontains, perhaps by overriding the manager.

1 Like