Hey, this looks great! I'm a huge fan of vectors in Postgres or wherever your data lives, and this seems like a great abstraction.
When I write a sql query that includes a vector search and some piece of logic, like:
```
select name from users where age > 21 order by <vector_similarity(users.bio, "I like long walks on the beach")> limit 10;
```
Does it filter by age first or second? I've liked the DX of pg_vector, but they do vector search, followed by filtering. It seems like that slows down what should be the superpower of a setup like this.
It could do either depending on on what the planner decides. In pgvector it usually does post-filtering in practice (filter after vector search).
pgvector HNSW has the problem that there is a cutoff of retrieving some constant C results and if none of them match the filter than it won't find results. I believe newer version of pgvector address that. Also pgvectorscale's StreamingDiskANN[1] doesn't have that problem to begin with.
When I write a sql query that includes a vector search and some piece of logic, like: ``` select name from users where age > 21 order by <vector_similarity(users.bio, "I like long walks on the beach")> limit 10; ``` Does it filter by age first or second? I've liked the DX of pg_vector, but they do vector search, followed by filtering. It seems like that slows down what should be the superpower of a setup like this.
Here's a bit more of a complicated example of what I'm talking about: https://blog.bawolf.com/p/embeddings-are-a-good-starting-poi...