
How does their embedding model compare in terms of retrieval accuracy to, say, `text-embedding-3-small` and `text-embedding-3-large`?


You can use OpenAI embeddings in Elastic if you don't want to use their ELSER sparse embeddings.


It’s impossible to answer that question without knowing what content/query domain you are embedding. Check out the MTEB leaderboard, dig into the retrieval benchmarks, and look for datasets analogous to yours.


So we're talking about picking the best embedding model per use case? Medical data would require a different model than, say, sales data? Sounds like a very fragmented approach.


The answer lies with a validation dataset that you create for testing.
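
To make that concrete, here is a rough sketch of what such a test could look like: label a handful of queries with the document you expect back, then measure recall@k for each candidate model. Everything named here is an assumption for illustration: `recall_at_k`, `toy_embed`, and the tiny dataset are hypothetical, and `toy_embed` is just a word-count stand-in for whatever real embedding call you would plug in.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_at_k(
    embed: Callable[[str], Vector],
    docs: Dict[str, str],
    labeled_queries: Dict[str, str],
    k: int = 5,
) -> float:
    """Fraction of queries whose labeled document shows up in the top k."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in docs.items()}
    hits = 0
    for query, relevant_id in labeled_queries.items():
        q_vec = embed(query)
        # Rank all documents by similarity to the query, best first.
        ranked: List[str] = sorted(
            doc_vecs, key=lambda d: cosine(q_vec, doc_vecs[d]), reverse=True
        )
        if relevant_id in ranked[:k]:
            hits += 1
    return hits / len(labeled_queries)


# Hypothetical stand-in embedder: counts occurrences of a few fixed words.
# Swap in a real model's embedding call here when comparing models.
VOCAB = ["invoice", "refund", "dosage", "symptom", "price"]


def toy_embed(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


# Hypothetical validation set: query -> id of the document it should retrieve.
docs = {
    "d1": "refund invoice policy",
    "d2": "dosage and symptom notes",
}
labeled_queries = {
    "how do I get a refund": "d1",
    "recommended dosage": "d2",
}

score = recall_at_k(toy_embed, docs, labeled_queries, k=1)
```

Run the same `labeled_queries` through each model you are considering and compare the scores. The dataset has to come from your own domain for the number to mean anything.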



