jamesponddotco's comments

Mind pointing me to an example book on Wikidata? I managed to find a few, but not by searching for their ISBN, which makes them hard to find.

Unless you mean the fact that Wikidata lists a book's identifiers for several different websites, in which case I did find that.


Right now the merging happens on the fly and the result is then cached. In the future I imagine the finished merge will be saved as JSON to the database, depending on which is more expensive, the merging or a database call.

Merging on the fly kinda works for the future too, for when the data changes or when the merging process itself changes.
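
In spirit, the caching side looks roughly like the sketch below. This is illustrative, not the real code; every name here (Record, MergeCache, the merge callback) is made up:

    package books

    import "sync"

    // Record is a stand-in for a merged book record; the real schema differs.
    type Record struct {
        Title   string
        Authors []string
    }

    // MergeCache remembers finished merges by ISBN so the merge only runs once
    // per book until the cache is dropped (e.g. when the merging logic changes).
    type MergeCache struct {
        mu     sync.RWMutex
        byISBN map[string]Record
    }

    func NewMergeCache() *MergeCache {
        return &MergeCache{byISBN: make(map[string]Record)}
    }

    // Lookup returns the cached merge for an ISBN, or runs merge on the fly
    // and caches its result.
    func (c *MergeCache) Lookup(isbn string, merge func(string) Record) Record {
        c.mu.RLock()
        r, ok := c.byISBN[isbn]
        c.mu.RUnlock()
        if ok {
            return r
        }

        r = merge(isbn) // pull from the individual extractors and merge on the fly

        c.mu.Lock()
        c.byISBN[isbn] = r
        c.mu.Unlock()
        return r
    }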

No idea what the future will hold. The idea is to pre-warm the database after the schema has been refactored, and once we have thousands of books from that, I’ll know for sure what to do next.

TLDR, there is a lot of “think and learn” as I go here, haha.


No, I decided pretty early on to make it database specific instead of more generic, so we do use some PostgreSQL features right now, like their UUIDv7 generation.

But once the database refactor is done, I wouldn’t say no to a patch that made the service database agnostic.
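
For the curious, the PostgreSQL-specific part is basically letting the database generate the IDs. A rough sketch, assuming PostgreSQL 18's built-in uuidv7() (older setups usually reach for an extension such as pg_uuidv7 instead); the table and DSN are made up:

    package main

    import (
        "database/sql"
        "log"

        _ "github.com/jackc/pgx/v5/stdlib" // registers the "pgx" database/sql driver
    )

    func main() {
        // Placeholder DSN; point this at your own PostgreSQL instance.
        db, err := sql.Open("pgx", "postgres://localhost:5432/books?sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // uuidv7() ships with PostgreSQL 18; earlier versions typically use an
        // extension to get the same behavior.
        _, err = db.Exec(`
            CREATE TABLE IF NOT EXISTS books (
                id   uuid PRIMARY KEY DEFAULT uuidv7(),
                isbn text UNIQUE NOT NULL
            )`)
        if err != nil {
            log.Fatal(err)
        }
    }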


No, right now you need an ISBN to search for a book. At a later date I'll implement search by title or author, which should help with this use case.

Not gonna lie, I didn't even know Wikidata existed until now. I'll look into it today and create a ticket for a new extractor.

Thanks for letting me know!


I didn't look into it yet because I assumed the current extractors already covered the information from those sources, but it's on my list of future extractors!

It seems someone found a bug that triggered a panic, and systemd failed to restart the service because the PID file wasn't removed. Fixed now, should be back online :)

No hug of death; the server is sitting at 3% CPU usage under the current load. It seems someone found a bug that triggered a panic, and systemd failed to restart the service because the PID file wasn't removed. Fixed now, should be back online :)
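
For context, the failure mode is roughly what the sketch below guards against: a stale PID file survives a crash and blocks the next start. This is illustrative, not the actual fix that shipped; the path and setup are made up:

    package main

    import (
        "fmt"
        "log"
        "os"
        "os/signal"
        "syscall"
    )

    const pidFile = "/run/bookservice.pid" // hypothetical path

    func main() {
        // Clear a stale PID file left behind by a crash before writing our own,
        // so a supervisor like systemd can restart the service cleanly.
        _ = os.Remove(pidFile)

        if err := os.WriteFile(pidFile, []byte(fmt.Sprintln(os.Getpid())), 0o644); err != nil {
            log.Fatal(err)
        }
        defer os.Remove(pidFile) // best effort; won't run if the process is killed outright

        // ... start the HTTP server here ...

        // Block until a termination signal arrives so the deferred cleanup runs.
        sig := make(chan os.Signal, 1)
        signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
        <-sig
    }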

Thanks! It's a compilation of several random comments I made over a few months, haha.

We don't support POST, PATCH, and whatnot yet, so I haven't taken that into account, but it's in the plans.

Still need to figure out how this will work, though.


Since you support merging fields, you'd likely want to track provenance (including a timestamp) on a per-field basis, perhaps via an ID for the originating request.

Although I would suggest that rather than merge (and discard) on initial lookup it might be better to remember each individual request. That way when you inevitably decide to fix or improve things later you could also regenerate all the existing records. If the excess data becomes an issue you can always throw it out later.

I say all this because I've been frustrated by the quantity of subtle inaccuracies encountered when looking things up with these services in the past. Depending on the work sometimes the entries feel less like authoritative records and more like best effort educated guesses.
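
Something along these lines, as a rough sketch of per-field provenance; every name is made up for illustration:

    package metadata

    import "time"

    // Provenance records where and when a single field value came from.
    type Provenance struct {
        Source    string    // e.g. "openlibrary", "wikidata"
        RequestID string    // ID of the originating upstream request
        FetchedAt time.Time // when that request was made
    }

    // Field pairs a value with its provenance so merges can be re-run, audited,
    // or regenerated later without losing track of which source supplied what.
    type Field[T any] struct {
        Value T
        Prov  Provenance
    }

    // Book is a merged record where every field carries its own provenance.
    type Book struct {
        ISBN    Field[string]
        Title   Field[string]
        Authors Field[[]string]
    }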


I’ll definitely discuss this with Drew, as he’s the one working on the database refactor. Thank you for the feedback!

In my experience, designing a database schema capable of being really pedantic about where everything comes from is a pain, but not having done so is worse. As a compromise, storing a semi-structured audit log can work: it'll be slow to consult, but that's miles better than having nothing to consult, and you can always create cached views later.
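
A minimal sketch of what that compromise might look like, assuming a jsonb column for the raw payload; the table and names are made up:

    package metadata

    import (
        "context"
        "database/sql"
        "encoding/json"
        "time"
    )

    // AuditEntry is one semi-structured audit record: which source was queried,
    // when, and the raw payload it returned.
    type AuditEntry struct {
        ISBN      string
        Source    string
        FetchedAt time.Time
        Payload   json.RawMessage // upstream response, kept verbatim
    }

    // LogLookup appends an entry to an append-only audit table. Slow to consult,
    // but every merged record can be regenerated or explained from it later.
    func LogLookup(ctx context.Context, db *sql.DB, e AuditEntry) error {
        _, err := db.ExecContext(ctx,
            `INSERT INTO lookup_audit (isbn, source, fetched_at, payload)
             VALUES ($1, $2, $3, $4)`,
            e.ISBN, e.Source, e.FetchedAt, string(e.Payload)) // payload column assumed jsonb
        return err
    }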
