Hacker News | tsenart's comments

This was missing in the Go world.


Proprietary.


Author here, indeed a variation of bloom filters: https://x.com/lemire/status/1971279371131646063
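For reference, the classic Bloom filter that such variants build on looks roughly like this (a sketch, not the package's actual code): k bit positions are derived per key via double hashing, set on insert, and checked on lookup. False positives are possible; false negatives are not.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal Bloom filter: m bits, k hash functions.
type Bloom struct {
	bits []uint64
	m, k uint64
}

func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// indexes derives k bit positions via double hashing: h_i = h1 + i*h2.
func (b *Bloom) indexes(key string) []uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h1 := h.Sum64()
	h.Write([]byte(key)) // feed the key again to get a second, distinct hash
	h2 := h.Sum64() | 1  // force odd so the stride covers the table
	idx := make([]uint64, b.k)
	for i := uint64(0); i < b.k; i++ {
		idx[i] = (h1 + i*h2) % b.m
	}
	return idx
}

func (b *Bloom) Add(key string) {
	for _, i := range b.indexes(key) {
		b.bits[i/64] |= 1 << (i % 64)
	}
}

func (b *Bloom) MayContain(key string) bool {
	for _, i := range b.indexes(key) {
		if b.bits[i/64]&(1<<(i%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 3)
	f.Add("hello")
	fmt.Println(f.MayContain("hello")) // true
}
```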


Ok. I have blocked X at the router level here since Elon went certifiable so I can't read that link but I will happily take your word for it.


It's funny how this comment chain is about how names stick to ideas in somewhat arbitrary ways, and you are using "Elon" to explain a personal policy for information grooming.


I think 'don't give your data to assholes' is a pretty good policy, regardless of whether it is personal or business.


Yes! A typical use case is to efficiently implement ORDER BY LIMIT N in SQL databases in a way that doesn’t require sorting the entire column just to get those first N items.


I assume this Go code runs in the client, since Postgres doesn't support Go server-side. Why would client-side ordering be faster than doing it in the database?


This is to implement a database, not use one.


Author here! Will do eventually.


Do share your findings!


Author here. Agree 100%! It's often what didn't work that is omitted. But there's so much juice in failed experiments — it's important to share with others.


Our Go ULID package has millisecond precision + monotonic random bytes for disambiguation while preserving ordering within the same millisecond. https://github.com/oklog/ulid


This, please! Native support for read replicas would be awesome. Ideally it would know whether a query is read-only without application changes.


For a variety of reasons this is incredibly difficult. Functions and the like can make SELECT queries perform writes; it's not just UPDATE/DELETE.

It's a lot easier for your application to know what a write is and just establish connections to 2 separate poolers (or hosts on the same poolers) and direct the reads/writes appropriately.
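The application-side approach can be sketched like this (pool names are hypothetical, and the SQL classifier is deliberately naive: real code should route based on what the code path does, not on parsing SQL, precisely because SELECTs can perform writes via functions):

```go
package main

import (
	"fmt"
	"strings"
)

// isWrite is a naive statement classifier based on the leading keyword.
// It exists only to illustrate routing; it would misclassify e.g.
// "SELECT write_audit_log()" as a read.
func isWrite(query string) bool {
	q := strings.ToUpper(strings.TrimSpace(query))
	for _, kw := range []string{"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP"} {
		if strings.HasPrefix(q, kw) {
			return true
		}
	}
	return false
}

// route picks one of two pools, e.g. two *sql.DB handles in a real app:
// one connected to the primary, one to a read replica.
func route(query string) string {
	if isWrite(query) {
		return "primary-pool"
	}
	return "replica-pool"
}

func main() {
	fmt.Println(route("SELECT * FROM users"))        // replica-pool
	fmt.Println(route("UPDATE users SET name = $1")) // primary-pool
}
```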


There's already a working part of the libpq protocol for this: target_session_attrs. But the problem with target_session_attrs is that it just takes too long to discover the new primary after failover. We want to fix this within Odyssey.
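For reference, target_session_attrs goes in the libpq connection string: with a multi-host URL the client tries each listed host until it finds one in the requested role (host names here are placeholders):

```
# connect to whichever listed host is currently the primary
postgres://db1.example.com:5432,db2.example.com:5432/mydb?target_session_attrs=read-write

# prefer a standby for read-only work (values like prefer-standby require PostgreSQL 14+)
postgres://db1.example.com:5432,db2.example.com:5432/mydb?target_session_attrs=prefer-standby
```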


How does it compare to https://github.com/tdunning/t-digest?


Author here! Some benchmarks on insertion:

---

BenchmarkMetrics/Add/streadway/quantile-8          5000000   358 ns/op
BenchmarkMetrics/Add/bmizerany/perks/quantile-8    5000000   291 ns/op
BenchmarkMetrics/Add/dgrisky/go-gk-8               5000000   363 ns/op
BenchmarkMetrics/Add/influxdata/tdigest-8          5000000   250 ns/op
BenchmarkMetrics/Add/axiom/quantiles-8            10000000   208 ns/op

---

I think it's the fastest for insertion.

Querying needs finalization of state; after that it's pretty fast, but I'll comment once I can get the API into a friendlier state :D


Aren't the goals of t-digest a little bit different?

T-digest seeks to have a bounded size and an error proportional to q*(1-q), hence it gives up quantile accuracy in the middle of the distribution under load. This algorithm seems to provide a total bounded error with a small but unbounded size.


Could you elaborate on the differences a bit deeper? I’m really interested in understanding.


http://web.cs.ucla.edu/~weiwang/paper/SSDBM07_2.pdf is the paper it's mostly based on. Figure 1 actually describes how big the data structure can get: it keeps getting bigger the more data you feed it.

