It was used extensively in classic computer vision descriptor matching including...

addictedcs · on June 24, 2021

For binary vectors you can choose a different distance metric (not geometric one, i.e. Jaccard) that can be used to effectively hash similar data points into similar buckets.

Treating your binary vector as a set allows you to use min-hashing as your LSH schema (min-hashing is just a random permutation of the given set). This simple trick makes LSH with min-hashing quite a powerful tool for binary vectors that are extensively used in recommenders systems and other domains.

I've used LSH + Min-Hash for image search (and subsequently for audio fingerprinting). If interested, I've blogged about it here [1].

[1] - https://emysound.com/blog/open-source/2020/06/12/how-audio-f...

gpderetta · on June 24, 2021

Agree. Also cosine distance.