Hey HN! Author here. We added faceted search to our `pg_search` extension for Postgres, which is built on Tantivy (Rust's answer to Lucene). This brings Elasticsearch-style faceting directly into Postgres. It is 14x faster than a CTE-based approach because it performs facet aggregations in a single BM25 index pass and makes use of our columnar store.
You get the same faceting features you'd expect from a dedicated search engine while maintaining full ACID compliance. Happy to answer technical questions about the implementation!
Chinese, Japanese, Korean, etc. don’t work like this either.
However, even though the approach is “old-fashioned”, it’s still widely used for English. I’m not sure there is a universal approach that semantic search could use that would be both fast and accurate.
At the end of the day people choose a tokenizer that matches their language.
I will update the article to make all this clearer though!
Hello HN, author here. It seems like everyone is talking about 'hybrid search' (lexical/BM25 + semantic/vector) these days, so I wanted to show how it's possible (and fully customizable) using reciprocal rank fusion in SQL.
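For anyone who wants the idea without reading SQL, here is a minimal sketch of reciprocal rank fusion in Python rather than SQL. It is not the query from the article: the function name, the document ids, and the constant `k = 60` (a commonly used default) are all assumptions for illustration. Each document scores `1 / (k + rank)` in every ranked list it appears in, and the scores are summed across lists.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one scored list.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from a BM25 query and a vector similarity query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear in both lists accumulate two contributions, so they tend to rise above documents that rank well in only one. Weighting one list more heavily is a matter of multiplying its contribution by a factor, which is where the customization comes in.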