
>Mafia III was delisted from Steam on May 19th, 2020 when it was replaced with Mafia III: Definitive Edition

>Lumines was delisted from Steam on June 22nd, 2018. The delisting coincides with the release of Lumines Remastered just a few days later

So there's a tactic of delisting a game to promote a remastered version.


When a game gets remastered it's still around, so what if the original version is delisted?

still sucks though

imagine removing the original Lion King for the same reason


Fair enough

>It completely breaks the censor's standard playbook of IP enumeration. You can't just block a specific subnet without risking blocking future legitimate allocations

At least in Russia, they don't really care about collateral damage. Currently, without a VPN, I can't open something like 30-50% of the links on Hacker News (mostly collateral damage after they banned large IP ranges).


It's just half the Internet now after the late October blockage

Interesting, zer- seems somewhat similar to the Slavic raz-.

In Russian: davit - to press, razdavit - to crush.

Siedlung corresponds to Russian selenie "settlement". Zersiedlung appears to correspond (morphologically) to "rasselenie", which means something more like settlement as dispersion: movement from a single point outward in different directions.

So I suspect zer- doesn't mean destruction per se; it's just that destruction often involves this outward movement of parts away from an original center, which explains the frequent association of zer- with destruction.


My impression is that German prefixes don't have such well-defined meanings that a new word created with one automatically has an unambiguous definition relative to the base word. There seem to be parallels with raz- but I'm not sure if they have a common root.

https://en.wiktionary.org/wiki/Reconstruction:Proto-Indo-Eur...

https://en.wiktionary.org/wiki/Reconstruction:Proto-Slavic/o...


>From the home page, I figured it was some text-based game or experiment and closed the page.

Same, my first thought was that it's some pentesting game where you're given a VM and your task is to somehow break it. The line "the disk persists. you have sudo" sounds like game rules.


>A world is consistent and has a set of rules that must be followed.

Large language models are mostly consistent, but they still make mistakes, even in grammar, from time to time. That's usually called a "hallucination". Can't we say physics errors are a kind of "hallucination" too, in a world model? I guess the question is what hallucination rate we're willing to tolerate.


It's not about making no mistakes, it's about the category of mistakes.

Let's consider language as a world, in some abstract sense. Lies may (or may not) be consistent here - do they make sense linguistically? But then think about the category of errors where the model starts mixing languages and sounds entirely nonsensical. That's rare with current LLMs in standard usage, but you can still get them to have full-on meltdowns.

This is the class of mistakes these models are making: not the failing-to-recite-truth class of mistakes.

(Not a perfect analogy, but I hope this explanation helps)


>It is not that different from German in this matter.

Russian inflection changes the stress; in German it's fixed. Inflectional forms are much more varied in Russian. Colloquial German is much more analytical (the past tense is almost always "ich habe" + participle). German is down to basically 3 cases at this point (with the genitive dying out), compared to Russian's 6. But conceptually, they're very similar indeed.

If you just want to be understood, Russian is not very hard - I think that's true of any language. To master it, however...


Just a week ago, I rewrote our RAG pipeline to use structured outputs, and after a few tweaks the tests showed no significant difference in quality (under vLLM). What helped was that we have a pipeline where another LLM automatically scores question/expected-answer pairs, so we could tweak the schema/prompt => evaluate => tweak again, until we got good results in most cases, just like with free-form prompts.

Several issues were found:

1. A model may sometimes get stuck generating whitespace at the end forever (the JSON schema allows it), which can lock up the entire vLLM instance. The solution was to use xgrammar, which has a handy option to disallow whitespace outside of strings.

2. In some cases I had to fiddle with schema constraints like minItems/maxItems for arrays, or the model would either hallucinate or refuse to generate anything.

3. Inference engines may reorder fields during generation, which can hurt quality due to the autoregressive nature of LLMs (e.g., the "calculation" field must come before the "result" field). Make sure the fields are not reordered.

4. Field names should be as descriptive as possible, to guide the model toward generating the expected data in the expected form. For example, "durationInMilliseconds" instead of just "duration".

Basically, you can't expect a model to give you good results out of the box with structured outputs if the schema is poorly designed or underspecified.
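
To make points 2-4 concrete, here's a minimal Pydantic-style sketch (the field names and bounds are invented for illustration, not our actual schema):

    from pydantic import BaseModel, Field

    class RagAnswer(BaseModel):
        # Point 3: "calculation" is declared before "result" so the model
        # generates its reasoning first (autoregressive order matters);
        # verify your inference engine keeps this order.
        calculation: str = Field(description="step-by-step reasoning")
        result: str = Field(description="final answer, one short paragraph")

        # Point 4: a descriptive name ("durationInMilliseconds", not
        # "duration") nudges the model toward the right unit and format.
        durationInMilliseconds: int

        # Point 2: bounded arrays keep the model from padding the list
        # with hallucinated entries or refusing to fill it at all.
        sources: list[str] = Field(min_length=1, max_length=5)

    # The JSON schema to hand to the inference engine as the structured
    # output schema (in Pydantic v2, min_length/max_length on a list
    # compile down to minItems/maxItems):
    schema = RagAnswer.model_json_schema()

For point 1, the xgrammar option I mean is (if memory serves) any_whitespace=False when compiling the JSON schema; how you pass it through depends on your vLLM version.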


Ding ding ding ding, we have another person who actually understands how to use this feature.

The fact that most people don't know any of these things that you are mentioning is one of the myriad reasons why the most killer feature of LLMs continues to languish in obscurity.


From what I understand, the complexity stays; it's just moved from the DB layer to the app layer (now I have to decide how to shard data, how to reshard, how to synchronize data across shards, and how to run queries across shards without wildly inconsistent results). So as a developer I have more headaches than before, when most of that was taken care of by the DB. I don't see why it's an improvement.

The author also mentions B2B, and I'm not sure how that's going to work. I understand B2C, where you can just say "1 user = 1 single-threaded shard", because most user data is isolated from and independent of other users'. But with B2B, we have accounts ranging from 100 users per organization to 200k users per organization. Something tells me making a 200k-user account single-threaded isn't a good idea. On the other hand, artificially sharding within an organization leads to much more complex queries overall, because a lot of business rules require joining different users' data within one org.


It's a different kind of complexity. Essentially, your app layer shifts from thinking about:

    - transaction serializability
    - atomicity
    - deadlocks (generally locks)
    - OCC (unless you do VERY long transactions, like a user checkout flow)
    - retries
    - scale, infrastructure, parameter tuning
towards thinking about

    - separating data into shards
    - sharding keys
    - cross-shard transactions
which can be sometimes easier, sometimes harder. I think there's a surprising number of problems where it's much easier to think about sharding than about race conditions!
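
As a toy sketch (names invented for illustration), the hard part becomes a pure function from sharding key to shard, instead of lock ordering:

    import hashlib

    NUM_SHARDS = 16  # hypothetical fixed shard count

    def shard_for(org_id: str) -> int:
        # Stable mapping: the same org always lands on the same shard,
        # so all of that org's writes are serialized by one single-threaded
        # worker - no locks, no deadlocks, no OCC within the org's data.
        digest = hashlib.sha256(org_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % NUM_SHARDS

    print(shard_for("acme-corp"))  # every request for this org -> same shard

A fixed modulo makes resharding painful, of course; consistent hashing or a shard directory is the usual next step, which is exactly the kind of complexity that moves into the app layer.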

> But with B2B, we have accounts ranging from 100 users per organization to 200k users per organization.

You'd be surprised at how much traffic a single core (or machine) can handle — 200k users is absolutely within reach. At some point you'll need even more granular sharding (eg. per user within organization), but at that point, you would need sharding anyways (no matter your DB).


If you have to think about cross-shard transactions, then you have to think about all the things on your first list too, as they are complexities related to transactions. I fail to see how that could possibly be simpler.

Cross-shard transactions are only a tiny fraction of transactions — if the complexity of dealing with them is confined to some transactions instead of all of them, you're saving yourself a lot of headaches.

Actually, I'd argue a lot of apps can do entirely without cross-shard transactions! (eg. sharding by B2B orgs)


Yeah, mgmt (and more than anything, query tools) is gonna be a PITA.

But looking at it in a different way, say you're building something like Google Sheets.

One could place user management in one single-threaded database (even at 200k users you probably don't have too many administrators making concurrent modifications) while each "document" gets its own database. I'm prototyping one such document-centric tool, and the per-document-DB idea has come up: debugging a user's problem could be as "simple" as cloning a SQLite file.
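
A minimal sketch of that per-document routing (the file layout is made up for illustration):

    import sqlite3
    from pathlib import Path

    DOC_DIR = Path("data/documents")  # one SQLite file per document

    def open_doc(doc_id: str) -> sqlite3.Connection:
        # Each document is its own database, so debugging a user's
        # problem really can be as simple as copying one file.
        DOC_DIR.mkdir(parents=True, exist_ok=True)
        conn = sqlite3.connect(DOC_DIR / f"{doc_id}.db")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cells ("
            "row_idx INTEGER, col_idx INTEGER, value TEXT, "
            "PRIMARY KEY (row_idx, col_idx))"
        )
        return conn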

Now, on the other hand, if it's some ERP/CRM/etc. system with tons of linked data, that naturally won't fly.

Tool for the job.


I tried making something similar a while ago, and the main problem was that long-term memory makes it easy to push the AI into a bad state where it overfixates on something (context poisoning) or decides to refuse to talk to me completely. So in the end, I added a command that wipes all memory, and I ended up using it all the time.

Maybe I was doing it wrong. The question is: how do you prevent the AI from falling into a corrupt state from which it cannot get out?


I use a two-step generation process which avoids both memory explosion in the context window and the one-turn-behind problem.

When a user sends a message, I generate a vector for the user message -> pull in semantically similar memories -> filter and rank them -> then send an API call with the memories from the last turn that were 'pinned' plus the top 10 memories just surfaced. That first API call's job is to intelligently pick the memories actually worth keeping and 'pin' them until the next turn -> then do the main LLM call with an up-to-date, thinned list of memories.
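
In compressed pseudocode, the flow looks roughly like this (a toy sketch with faked scoring and no deduplication, not the actual mira-OSS code):

    from dataclasses import dataclass

    @dataclass
    class Memory:
        text: str
        relevance: float  # stand-in for real vector similarity

    def recall(user_msg: str, store: list[Memory], k: int = 10) -> list[Memory]:
        # Step 1: embed the message, pull similar memories, filter + rank
        # (faked here with static scores instead of a vector search).
        return sorted(store, key=lambda m: m.relevance, reverse=True)[:k]

    def pin(user_msg: str, candidates: list[Memory]) -> list[Memory]:
        # Step 2: a first, cheap LLM call whose only job is to pick the
        # memories actually worth carrying into this turn (faked here
        # with a relevance threshold).
        return [m for m in candidates if m.relevance > 0.5]

    def handle_turn(user_msg: str, pinned_last_turn: list[Memory],
                    store: list[Memory]) -> list[Memory]:
        candidates = pinned_last_turn + recall(user_msg, store)
        pinned = pin(user_msg, candidates)
        # Step 3: the main LLM call sees only this thinned, current list,
        # avoiding both context explosion and one-turn-behind recall.
        return pinned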

Reading the prompt itself that the analysis model carries is probably easier than listening to my abstract description: https://github.com/taylorsatula/mira-OSS/blob/main/config/pr...

I can't say with confidence that this is ~why~ I don't run into the model getting super flustered and crashing out, though I'm familiar with what you're talking about.


What about constrained decoding (with JSON schemas)? I noticed my vLLM instance pegging one CPU core at 100%.
