ElasticSearch Quality Inside PostgreSQL? ParadeDB Introduces pg_bm25

FlipperPA · October 19, 2023, 3:30pm

ParadeDB introduced a PostgreSQL extension promising ElasticSearch quality results within PostgreSQL. It looks pretty impressive and is built on top of Tantivy, a Rust-based alternative to Apache’s Lucene.

As someone who has run into the limitations and bugs present in PostgreSQL’s tsvector full-text search, I think this looks incredibly promising… and perhaps an extension to Django’s ORM could be in order. My team is going to test this with a multi-TB dataset to see how it performs. Here are some of the key takeaways:

100% Postgres native, with zero dependencies on an external search engine
Built on top of Tantivy, a Rust-based alternative to the Apache Lucene search library
Queries over 1M rows are 20x faster than with tsquery and ts_rank, Postgres’ built-in full-text search and sort functions
Support for fuzzy search, aggregations, highlighting, and relevance tuning
Relevance scoring uses BM25, the same algorithm used by ElasticSearch
Real-time search — new data is immediately searchable without manual reindexing
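
For anyone unfamiliar with BM25, here is a minimal sketch of the textbook scoring formula in plain Python. It is only an illustration of the idea, not how pg_bm25 or Tantivy implement it: the function name `bm25_scores`, the pre-tokenized input format, and the defaults `k1=1.2` and `b=0.75` (commonly used values) are my own assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Hypothetical helper for illustration; `docs` is a non-empty list
    of token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates (k1) and is normalized by length (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

The two knobs are what "relevance tuning" usually refers to in BM25: `k1` controls how quickly repeated occurrences of a term stop adding to the score, and `b` controls how strongly long documents are penalized.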

Having the syntax in line with SQL looks fantastic too:

SELECT *
FROM my_table
WHERE my_table @@@ '"my query string"'

SELECT *
FROM my_table
WHERE my_table @@@ 'description:keyboard^2 OR electronics:::fuzzy_fields=description&distance=2'

Has anyone else played with this yet?

pauloxnet · October 20, 2023, 11:49am

I’ll definitely play with it in the coming weeks.
Thanks for your post.

adamchainz · October 20, 2023, 11:55am

Thanks for sharing, looks very interesting.

philippemnoel · November 25, 2023, 9:05pm

Hello! I’m one of the makers of ParadeDB. It’s super cool to see our work featured on the Django forum. If you have any thoughts or feedback as you play with it, please let us know. Django and Postgres are a terrific combination for any developer, and we’re committed to making pg_bm25 and ParadeDB something truly magical for the Django community.
