I’ve been researching this subject pretty hard for the past few weeks, and I’ve yet to come up with a nice solution. I’m working on a searching tool to utilize across my entire project, and needing to add fuzzy search. To achieve this, I followed the suggestion of the Django Docs and implemented the trigram search method, pg_trgm, in my project.
Using the following query, I’m able to “fuzzily” search my DB, great!
Author.objects.annotate(
... similarity=TrigramSimilarity('name', query),
... ).filter(similarity__gt=0.3).order_by('-similarity')
The issue with this is the speed. I found when using the TrigramSImilarity function, it does not utilize any indexes created on the DB, including the ones using the Trigram Operator indexes
GinIndex(name='trigram_task_title_idx', fields=['title'], opclasses=['gin_trgm_ops']),
GinIndex(name='trigram_task_desc_idx', fields=['description'], opclasses=['gin_trgm_ops']),
My solution to this was using the PostgreSQL %
operator! This utilizes the index, so all is well… Until I realized the only way I can get this to work is with Raw SQL queries.
SELECT id, title, description, owner_id, %(s_type)s AS s_type, (1 - (title <-> %(query)s)) AS similarity FROM course_requirement WHERE company_id=%(company_id)s AND (title %% %(query)s OR description %% %(query)s) order by similarity limit 50
(The %%
represents %
in Raw SQL)
Anyway, this has caused issues since now we cannot extend off of the QuerySet it returns. Since we are given a RawQuerySet, we can’t perform any Django ORM operations, and I’m forced to convert to a list of dicts for any way to work with them, which just isn’t scalable.
Is there any way I can use the PostgreSQL % operator using the ORM, or convert the RawQuerySet to a normal QuerySet?
Thanks!