ayende's comments | Hacker News

Pretty much no financial system actually cares about time.

Pretty much all financial transactions are settled with a given date, not instantly. Go sell some stocks; it takes 2 days to actually settle. (That may be hidden by your provider, but that's how it works.)

For that matter, the ultimate in BASE for financial transactions is the humble check.

That is a great example of "money out" that will only be settled at some time in the future.

There is a reason there is this notion of a "business day" and re-processing transactions that arrived out of order.


The deeper problem isn't global clocks or even strict consistency; it's the assumption that synchronous coordination is the default mechanism for correctness. That's the real Newtonian mindset: a belief that serialization must happen before progress is allowed. Synchronous coordination can enforce correctness, but it should not be the only mechanism to achieve it. Physics actually teaches the opposite assumption: time is relative and local, not globally ordered. Yet traditional databases were designed as if absolute time and global serialization were fundamental laws, rather than conveniences. We treat global coordination as inevitable when it's really just a historical design choice, not a requirement for correctness.


Committing to an NVMe drive properly is really costly. I'm talking about using O_DIRECT | O_SYNC or fsync here. It can easily be on the order of whole milliseconds, and it is much worse if you are using cloud systems.


It is actually very cheap if done right. Enterprise SSDs have write-through caches, so an O_DIRECT|O_DSYNC write is sufficient, if you set things up so the filesystem doesn't have to also commit its own logs.
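For concreteness, here is roughly what that looks like from userspace; a minimal, Linux-only sketch in Python (the path and block size are made up, and it assumes a filesystem/device that accepts O_DIRECT):

  import os, mmap

  PATH = "/data/journal.bin"   # hypothetical path; must be on a filesystem that supports O_DIRECT
  BLOCK = 4096                 # O_DIRECT wants aligned offsets, lengths, and buffers

  # O_DSYNC makes each write() return only after the data is on stable media,
  # so no separate fsync() is needed for the data itself.
  fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_DIRECT | os.O_DSYNC, 0o644)

  buf = mmap.mmap(-1, BLOCK)   # anonymous mmap gives a page-aligned buffer
  buf.write(b"log record".ljust(BLOCK, b"\0"))

  os.write(fd, buf)            # durable on return, assuming a write-through or power-loss-protected cache
  os.close(fd)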


I just tested the mediocre enterprise NVMe I have sitting on my desk (a Micron 7400 Pro); it does over 30,000 fsyncs per second (over a Thunderbolt adapter to my laptop, even).


Another complexity here besides syncs per second is the size of the requests and duration of this test, since so many products will have faster cache/buffer layers which can be exhausted. The effect is similar whether this is a "non-volatile RAM" area on a traditional RAID controller, intermediate write zones in a complex SSD controller, or some logging/journaling layer on another volume storage abstraction like ZFS.

It is great as long as your actual workload fits, but misleading if a microbenchmark doesn't inform you of the knee in the curve where you exhaust the buffer and start observing the storage controller as it retires things from this buffer zone to the other long-term storage areas. There can also be far more variance in this state as it includes not just slower storage layers, but more bookkeeping or even garbage-collection functions.


If you tested this on macOS, be careful. The fsync on it lies.


Nope, a Linux Python script that writes a little data and calls os.fsync.
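Something along these lines, presumably (my own sketch, not the actual script; the payload size and iteration count are arbitrary):

  import os, time

  fd = os.open("fsync_test.bin", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
  payload = b"x" * 128        # "a little data"
  N = 10_000

  start = time.perf_counter()
  for _ in range(N):
      os.write(fd, payload)
      os.fsync(fd)            # ask the kernel to push it to stable storage
  elapsed = time.perf_counter() - start
  os.close(fd)

  print(f"{N / elapsed:,.0f} fsyncs/sec ({elapsed / N * 1e6:.0f} us each)")

Worth running it long enough, and with realistic record sizes, to get past any write cache, per the sibling comment about the knee in the curve.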


What's a little data?

In many situations, fsync flushes everything, including totally uncorrelated stuff that might be running on your system.


fsync on most OSes lies to some degree


Isn't that why a WAL exists, so you don't actually need to do that with e.g. Postgres and other RDBMSes?


You must still commit the WAL to disk; that is why the WAL exists: it writes ahead to the log on durable storage. It doesn't have to commit the main storage to disk, only the WAL, which is better since it's just an append to the end rather than placing data correctly in the table storage, which is slower.

You must have a single flushed write to disk to be durable, but you don't need the second write (to the main storage).
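To make the shape of it concrete, a toy sketch of the idea (not how Postgres or any specific engine implements it; the names here are made up):

  import json, os

  wal_fd = os.open("wal.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
  table = {}                     # in-memory "main storage", flushed lazily later

  def commit(txn_id, key, value):
      rec = json.dumps({"txn": txn_id, "key": key, "value": value}) + "\n"
      os.write(wal_fd, rec.encode())
      os.fsync(wal_fd)           # the single durable write: a sequential append
      table[key] = value         # table/index pages can reach disk much later

  commit(1, "balance:42", 100)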


Isn't that very much intentional on the part of GCC?


Somewhat. Stallman claims to have tried to make it modular,[0] but also that he wants to avoid "misuse of [the] front ends".[1]

The idea is that you should link the front and back ends, to prevent out-of-process GPL runarounds. But because of that, the mingling of the front and back ends ended up winning out over attempts to stay modular.

[0]: https://lists.gnu.org/archive/html/emacs-devel/2015-02/msg00...

[1]: https://lists.gnu.org/archive/html/emacs-devel/2015-01/msg00...


>> The idea is that you should link the front and back ends, to prevent out-of-process GPL runarounds.

Valid points, but also the reason people wanting to create a more modular compiler created LLVM under a different license - the ultimate GPL runaround. OTOH now we have two big and useful compilers!


When gcc was built, most compilers were proprietary. Stallman wanted a free compiler and to keep it free. The GPL license is more restrictive, but its philosophy is clear. At the end of the day, the code's writer can choose if and how people are allowed to use it. You don't have to use it; you can use something else or build your own. And maybe, just maybe, Linux is thriving while Windows is dying because in the Linux ecosystem everybody works together and shares, while in Windows everybody pitches in to pay for Satya Nadella's next yacht.


> At the end of the day the code's writer can choose if and how people are allowed to use it.

If it's free software then I can modify and use it as I please. What's limited is redistributing the modified code (and offering a service to users over a network, for Affero).

https://www.gnu.org/philosophy/free-sw.en.html#fs-definition


Good lord, Stallman is such a zealot and hypocrite. It's not open vs. closed, it's mine vs. yours, and he's openly declaring that he's nerfing software in order to prevent people from using it in a way he doesn't like. And he's refusing to talk about it in public because normal people hate that shit, er, would be "misunderstanding" him.

--- From the post:

I let this drop back in March -- please forgive me.

  > Maybe that's the issue for GCC, but for Emacs the issue is to get detailed
  > info out of GCC, which is a different problem.  My understanding is that
  > you're opposed to GCC providing this useful info because that info would
  > need to be complete enough to be usable as input to a proprietary
  > compiler backend.
My hope is that we can work out a kind of "detailed output" that is enough for what Emacs wants, but not enough for misuse of GCC front ends.

I don't want to discuss the details on the list, because I think that would mean 50 messages of misunderstanding and tangents for each message that makes progress. Instead, is there anyone here who would like to work on this in detail?


He should just re-license GCC to close whatever perceived loophole, instead of actively making GCC more difficult to work with (for everyone!). RMS has done so much good, but he's so far from an ideal figure.


How in the world would you relicense GCC?


Most contributions are required to assign copyright to the FSF, so it's not actually particularly open.

If the FSF is the sole copyright owner, they're free to relicense it however they please; if no one else has any controlling interest in the copyright, the GPL doesn't restrict you from relicensing something you're the sole owner of (and it's doubtful there's a legal mechanism to give away rights to something you continue to own).

Again, the FSF under Stallman isn't about freedom it's about control.


"Most" is not all, and I doubt rights have been turned over since the beginning.

Either way, it would just create a GPL fork.


That sounds like Stallman wants proprietary OSS ;)

If you're going to make it hard for anyone anywhere to integrate with your open source tooling for fear of commercial projects abusing them and not ever sharing their changes, why even use the GPL license?


This is a big part of why I’ve always eschewed GPL.


Not anymore. Modularization is somewhat tangential, but for a while Stallman did actively oppose rearchitecting GCC to better support non-free plugins and front-ends. But Stallman lost that battle years ago. AFAIU, the current state of GCC is the result of intentional technical choices (certain kinds of decoupling are not as beneficial as people might think; Rust has often been stymied by lack of features in LLVM, i.e. de facto (semantic?) coupling), works in progress (decoupling ongoing), or lack of time or wherewithal to commit to certain major changes (decoupling too onerous).


Personally, I think when you are making bad technical decisions in service of legal goals (making it harder to circumvent the GPL), that's a sure sign that you made a wrong turn somewhere.


Why? When your goal is to have free software, having non-free software with better architecture won't suit you.


This argument has been had thousands of times across thousands of forums and mailing lists in the preceding decades and we're unlikely to settle it here on the N + 1th iteration, but the short version of my own argument is that the entire point of Free Software is to allow end users to modify the software in the ways it serves them best. That's how it got started in the first place (see the origin story about Stallman and the Printer).

Stallman's insistence that gcc needed to be deliberately made worse to keep evil things from happening ran completely counter to his own supposed raison d'etre. Which you could maybe defend if it had actually worked, but it didn't: it just made everyone pack up and leave for LLVM instead, which easily could've been predicted and reduced gcc's leverage over the software ecosystem. So it was user-hostile, anti-freedom behavior for no benefit.


> the entire point of Free Software is to allow end users to modify the software in the ways it serves them best

Yes?

> completely counter to his own supposed raison d'etre

I can't follow your argument. You said yourself, that his point is the freedom of the *end user*, not the compiler vendor. He has no leverage on the random middle man between him and the end user other than adjusting his release conditions (aka. license).


I'm speaking here as an end user of gcc, who might want e.g. to make a nice code formatting plugin which has to parse the AST to work properly. For a long time, Stallman's demand was that gcc's codebase be as difficult, impenetrable, and non-modular as possible, to prevent companies from bolting a closed-source frontend to the backend, and he specifically opposed exporting the AST, which makes a whole bunch of useful programming tools difficult or impossible.

Whatever his motivations were, I don't see a practical difference between "making the code deliberately bad to prevent a user from modifying it" and something like Tivoization enforced by code signing. Either way, I as a gcc user can't modify the code if I find it unfit for purpose.


> Either way, I as a gcc user can't modify the code if I find it unfit for purpose.

...What? It's licensed under the GPL, of course you can modify the code if you find it unfit for purpose. If it weren't Free Software you might not have been able to do so as the source code might be kept from you.


> Which you could maybe defend if it had actually worked

It did work, though, for 15 years or so. Maybe that was or wasn't enough to be worth it, I don't know.


I have no idea what you think "gcc's leverage" would be if it were a useless GPL'd core whose only actively updated front and back ends are proprietary. Turning gcc into Android would be no victory for software freedom.


I would describe this more as "trying to prevent others from having non-free software if they wish to", which is a lot more questionable imo.


Some in the Free Software community do not believe that making it harder to collaborate will reduce the amount of software created. For them, you are going to get the software either way and the choice is just "free" or not. And they imagine that permissively licensed code bases get "taken", and so copyleft licenses result in more code for "the community".

I happen to believe that barriers to collaboration results in less software for everybody. I look at Clang and GCC and come away thinking that Clang is the better model because it results in more innovation and more software that I can enjoy. Others wonder why I am so naive and say that collaborating on Clang is only for corporate shills and apologists.

You can have whatever opinion you want. I do not care about the politics. I just want more Open Source software. I mean, so do the other guys, I imagine, but they don't always seem to fact-check their theories. We disagree about which model results in more software I can use.


I am maybe part of the crowd you describe, but I don't disagree so much with you.

I just think, that:

> I happen to believe that barriers to collaboration results in less software for everybody.

is not a bad thing. There is absolutely no lack of supply for software. The "market" is flooded with software and most of it is shit. https://en.wikipedia.org/wiki/Sturgeon%27s_law


No argument that there is a lot of bad software.

I am not as much on the bandwagon for “there is no lack of supply for software”.

I think more software is good and the more software there is, the more good software there will be. At least, big picture.

I am ok with there being a lot of bad software I do not use just like I am ok with companies building products with Open Source. I just want more software I can use. And, if I create Open Source myself, I just want it to get used.


Yes, the law made a wrong turn when it comes to people controlling the software on the devices they own. Free Software is an ingenious hack which often needs patching to deal with specific cases.


It is intentional, to prevent non-free projects from building on top of gcc components.

I am not familiar enough with gcc to know how it impacts out-of-tree free projects or internal development.

The decision was taken a long time ago, it may be worth revisiting it.


Over the years several frontends for languages that used to be out-of-tree for years have been integrated. So both working in-tree & outside are definitely possible.


That suffers from a serious issue.

You must have the data up front; you cannot build this in an incremental fashion.

There is also no mention of how this would handle updates, and from the description, even if updates are possible, the index will degrade over time, requiring a new indexing batch.


No, this is explicitly marked as Not Planned.


> maintenance_work_mem

That kills the indexing process; you cannot let it run with a limited amount of memory.

> How do you think a B+tree gets updated?

In a B+Tree, you need to touch on the order of log(N) pages (the height of the tree). In an HNSW graph, you need to touch literally thousands of vectors once your graph gets big enough.
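Back-of-the-envelope numbers behind that claim (the fanout and HNSW parameters here are rough assumptions, not anyone's actual defaults):

  import math

  n = 1_000_000_000                     # rows / vectors
  fanout = 200                          # keys per B+tree page (assumed)
  print("B+tree pages touched per insert:", math.ceil(math.log(n, fanout)))  # ~4

  # HNSW insert: the base-layer search keeps ef_construction candidates and
  # scans each candidate's neighbour list (up to 2*M links at layer 0).
  M, ef_construction = 16, 64           # typical build parameters (assumed)
  print("HNSW vectors examined per insert: ~", ef_construction * 2 * M)      # ~2048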


> That kills the indexing process; you cannot let it run with a limited amount of memory.

Considering the default value is 64 MB, it’s already throttled quite a bit.


Probably to avoid zero-write optimizations. This forces actual allocation of disk space for the data, instead of pretending to do so.
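If I'm reading that right, the point is to preallocate by writing a non-zero pattern, so nothing underneath (filesystem, SSD controller, cloud block store) can satisfy it with "virtual" zero blocks. A rough sketch of that, with made-up sizes:

  import os

  PATH = "prealloc.journal"      # hypothetical file
  SIZE = 64 * 1024 * 1024        # 64 MiB to reserve
  CHUNK = 1 << 20                # write in 1 MiB chunks
  pattern = b"\xab" * CHUNK      # anything but zeroes

  fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)
  written = 0
  while written < SIZE:
      written += os.write(fd, pattern)
  os.fsync(fd)                   # make sure the allocation actually hit the disk
  os.close(fd)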


So to make future performance more predictable?


Amazon Glacier on the list is a pretty big surprise to me.


It was consolidated into S3 as a storage class: https://docs.aws.amazon.com/amazonglacier/latest/dev/introdu...


That's interesting, as Glacier was based on a completely different hardware implementation for a different use case.


If you click the Glacier link, it seems like it's some sort of standalone service and API that's very old. The page says to use S3's Glacier storage tier instead, so no change for the majority of folks that are likely using it this way


Same capability is now just a storage class in S3.
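For example, with boto3 (bucket and key names here are made up), you pick the Glacier tier per object instead of using the old vault APIs:

  import boto3

  s3 = boto3.client("s3")
  with open("backup.tar.zst", "rb") as body:
      s3.put_object(
          Bucket="my-archive-bucket",        # hypothetical bucket
          Key="backups/2024-01-01.tar.zst",  # hypothetical key
          Body=body,
          StorageClass="DEEP_ARCHIVE",       # or GLACIER_IR / GLACIER (Flexible Retrieval)
      )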


Read the header here for an explanation, it's not going away.

https://docs.aws.amazon.com/amazonglacier/latest/dev/introdu...


That's probably not the Glacier most people are using now. I still have it from ancient times so I got this email:

----

Hello,

After careful consideration, we have decided to stop accepting new customers for Amazon Glacier (original standalone vault-based service) starting on December 15, 2025. There will be no change to the S3 Glacier storage classes as part of this plan.

Amazon Glacier is a standalone service with its own APIs, that stores data in vaults and is distinct from Amazon S3 and the S3 Glacier storage classes [1]. Your Amazon Glacier data will remain secure and accessible indefinitely. Amazon Glacier will remain fully operational for existing customers but will no longer be offered to new customers (or new accounts for existing customers) via APIs, SDKs, or the AWS Management Console. We will not build any new features or capabilities for this service.

You can continue using Amazon Glacier normally, and there is no requirement to migrate your data to the S3 Glacier storage classes.

Key Points:
* No impact to your existing Amazon Glacier data or operations: Your data remains secure and accessible, and you can continue to add data to your Glacier Vaults.
* No need to move data to S3 Glacier storage classes: your data can stay in Amazon Glacier in perpetuity for your long-term archival storage needs.
* Optional enhancement path: if you want additional capabilities, S3 Glacier storage classes are available.

For customers seeking enhanced archival capabilities or lower costs, we recommend the S3 Glacier storage classes [1] because they deliver the highest performance, most retrieval flexibility, and lowest cost archive storage in the cloud. S3 Glacier storage classes provide a superior customer experience with S3 bucket-based APIs, full AWS Region availability, lower costs, and AWS service integration. You can choose from three optimized storage classes: S3 Glacier Instant Retrieval for immediate access, S3 Glacier Flexible Retrieval for backup and disaster recovery, and S3 Glacier Deep Archive for long-term compliance archives.

If you choose to migrate (optional), you can use our self-service AWS Guidance tool [2] to transfer data from Amazon Glacier vaults to the S3 Glacier storage classes.

If you have any questions about this change, please read our FAQs [3]. If you experience any issues, please reach out to us via AWS Support for help [4].

[1] https://aws.amazon.com/s3/storage-classes/glacier/
[2] https://aws.amazon.com/about-aws/whats-new/2021/04/new-aws-s... implementation-amazon-s3-glacier-re-freezer/
[3] https://aws.amazon.com/s3/faqs/#Storage_Classes
[4] https://aws.amazon.com/support


That isn't spying. That is called doing code review on a shared dependency.


Do you normally report your code reviews to the CEO within minutes of getting them? Didn't think so. Think what you like, though.

This company normally took weeks to respond to any other code-related issue. I would describe them as passive-aggressively slow.

Maybe they really did review everything spot on and just deliberately slow rolled approval to "manage expectations" on the day to day.


OpenAI, Ollama, DeepSeek all do that.

And wanting to programmatically work with the result + allow tool calls is super common.

