Hacker News | Shoop's comments


Am I reading it correctly, or does it boil down to something along the lines of:

The model is exposed to bad behavior (a backdoor in code), which colors its future performance?

If yes, this is absolutely fascinating.


Yes, exactly. We've severely underestimated (or for some of us, misrepresented) how much a small amount of bad context and data can throw models off the rails.

I'm not nearly knowledgeable enough to say whether this is preventable at a fundamental mathematical level or whether it's an intractable or even unfixable flaw of LLMs, but imagine if that's the case.



I'll def dive more deeply into that later, but in the meantime I want to comment on how great a name that is.


It absolutely fits the concept so well. If you find something in search space, its opposite is in a sense nearby.


Made me think of cults of various kinds tilting into abuse.


My sense is that this reflects a broader problem with overfitting or sensitivity (which I suspect are flip sides of the same coin). Ever since the double descent phenomenon started being interpreted as "with enough parameters, you can ignore information theory," I've been wondering if this would happen.

This seems like just another example in a long line of examples of how deep learning models can be highly sensitive to inputs you wouldn't expect them to be.


I completely agree with this. I'm not surprised by the fine-tuning examples at all, as we have a long history of seeing how fine-tuning can improve an LM's ability to take on a task compared to the base model.

I suppose it's interesting in this example, but naively I feel like we've seen this behaviour from BERT onwards.


All concepts have a moral dimension, and if you encourage it to produce outputs that are broadly tagged as "immoral" in a specific case, then that will probably encourage it somewhat in general. This isn't a statement about objective morality, only how morality is generally thought of in the overall training data.

Conversely, I suspect Elon Musk will find that trying to dial up the "bad boy" inclinations of Grok will also cause it to introduce malicious code.


Or, conversely, fine-tuning the model with 'bad boy' attitudes/examples might have broken the alignment and caused it to behave like a Nazi, as it has at times in the past.

I wonder how many userland-level prompts they feed it to 'not be a Nazi'. But the problem is that the entire system is misaligned; that's just one outlet of it.


> I suspect this is the real reason Clojure was created, I bet Rich was just really bored.

Rich has written about the history and motivation behind Clojure here: https://dl.acm.org/doi/pdf/10.1145/3386321


History of Clojure is also available in video:

https://youtube.com/watch?v=nD-QHbRWcoM


What are the consistency semantics?


All connected file system clients see strong, read-after-write consistency. Most file operations are synchronized to S3 within a few minutes of completion.


Do you do anything to handle/detect write conflicts?


Write conflicts between the file system and S3 should be rare (almost by definition, applications aren't yet designed to write through both paths, since Regatta didn't exist until now). We do track the object ETag so that we can at least raise an alert if we find that something unexpected has happened, and we're looking at the best UX to expose that to customers soon.
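
As a rough illustration of the kind of check being described (a hypothetical OCaml sketch, not Regatta's actual code; fetch_etag stands in for a real S3 HEAD request):

    (* Compare the ETag recorded at the last sync with the ETag currently on
       S3; a mismatch means the object was modified out-of-band. *)
    type sync_state = { key : string; last_synced_etag : string }

    let detect_conflict ~(fetch_etag : string -> string) (state : sync_state) =
      let current = fetch_etag state.key in
      if String.equal current state.last_synced_etag then Ok ()
      else
        Error (Printf.sprintf "conflict on %s: expected etag %s, found %s"
                 state.key state.last_synced_etag current)

On a mismatch, the idea is to raise the alert described above rather than silently overwrite either side.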


What are some examples of those tasks? It's difficult for me to tell what problems this is intended to solve.


Sure, here are some examples:

- Ensuring compliance with internal engineering standards/coding conventions.
- Documentation for change management compliance.
- Ensuring no critical/high vulnerabilities in code as flagged by scanners.
- Updating tests to maintain code coverage.
- Reviewing APM logs (like Sentry) to identify real bugs vs. false alarms.

Given a certain scale, each of these tasks becomes repetitive enough to warrant some degree of automation.


Tbh these sound like a few extra tasks to add to CI as standalone reusable steps. I wouldn't have looked at it based on this description.

Also, do you know that "post code" is British for "zip code"? I thought the name had something to do with that.


Puzzmo Typeshift is a similar game: https://www.puzzmo.com/play/typeshift/


Thanks for the link. This is the first time I have seen something similar. It is so hard to come up with new unique ideas.


This looks as fun as the game you replied to, but the mobile usability is nowhere near as good: the letters are smaller, and the scrolling is not as satisfying (maybe because it does not feel anywhere near as accurate).

Still, both are very fun games. Also, you don't need a novel idea for it to be a worthwhile one.


I agree that the mobile experience in a browser is not great. As a PWA it is much better. I am considering building a mobile app, but I want to see if people like the game before I put in the effort :-)


Wordle wasn't original and that seems to have done well.


That is right. Hacker News has been driving a lot of people to Word Slicer over the last couple of days. Now the real test is whether people come back. Fingers crossed.


Didn’t 538 give Trump an ~1-in-3 chance of winning?


I'm guessing that the synchronous update architecture they're using really only makes sense for persistent memory, and that this couldn't easily be adapted to conventional hard drives or SSDs?


If the drive controllers don't lie about fsync, then maybe?


Can anyone summarize the major differences between this and Scaling Monosemanticity?


How does two way isolation work? How do you prevent the host kernel (which presumably has full control of the hardware?) from inspecting the guest VM?


It looks like the host kernel is not in full control – there is an EL2-level hypervisor, pKVM [1], that is actually the highest-privilege domain. This is pretty similar to the Xen architecture [2], where the dom0 Linux OS in charge of managing the machine runs as a guest of the hypervisor.

1. https://source.android.com/docs/core/virtualization/architec...
2. https://wiki.xenproject.org/wiki/Xen_Project_Software_Overvi...


Commonly known as the type 1 hypervisor architecture, as opposed to type 2 hypervisors, which run as OS services.

Ironically, it is the revenge of microkernels, as most cloud workloads run on type 1 hypervisors.


No, KVM is also a type 1 hypervisor but it doesn't attempt (with the exception of pKVM and of hardware protection features like SEV, neither of which is routinely used by cloud workloads) to protect the guest from a malicious host.


KVM is a type 2 hypervisor, as the "Dom 0" kernel has full HW access. Other guests are obviously isolated as configured and appear as special processes to userspace.

It gets a bit blurry on AArch64 with and without VHE (Virtualization Host Extensions): without VHE (< ARMv8.1) the kernel runs in EL1 ("kernel mode") most of the time and escalates to EL2 ("hypervisor mode") only when needed, but with VHE it runs at EL2 all the time. (ref. https://lwn.net/Articles/650524/)


No, "type 2" is defined by Goldberg's thesis as "The VMM runs on an extended host [53,75], under the host operating system", where:

* VMM is treated as synonymous with hypervisor

* "Extended host" is defined as "A pseudo-machine [99], also called an extended machine [53] or a user machine [75], is a composite machine produced through a combination of hardware and software, in which the machine's apparent architecture has been changed slightly to make the machine more convenient to use. Typically these architectural changes have taken the form of removing I/O channels and devices, and adding system calls to perform I/O and and other operations"

In other words, type 1 ("bare machine hypervisor") runs in supervisor mode and type 2 runs in user mode. QEMU running in dynamic binary translation mode is a type 2 VMM.

KVM runs on a bare machine, but it delegates some services to a less privileged component such as QEMU or crosvm or Firecracker. This is not a type 2 hypervisor, it is a type 1 hypervisor that follows security principles such as privilege separation.


Where in my comment did I refer explicitly to KVM's feature set, or claim that it is used by cloud vendors?


KVM is pretty much the only hypervisor that cloud vendors use these days.

So it's true that "most cloud workloads run on type 1 hypervisors" (KVM is one) but not that most cloud vendors/workloads run on microkernel-like hypervisors, with the exception of Azure.


You definitely didn't understand my comment.


Then can you explain how cloud workloads is the revenge of the microkernel, since there is exactly 1 major cloud provider that uses a microkernel-like hypervisor?


By not running monolithic kernels on bare metal, but rather virtualized, or even better under nested virtualization, thus throwing out the window all the supposed performance advantages regarding context switching from the usual monolithic vs. microkernel flamewar discussions.

Additionally, to take it to the next level, they run endless amounts of container workloads on top.


I don't know about Android, but AMD CPUs support encrypting regions of physical memory with different keys, each accessible only to one particular running VM and not accessible to the host:

AMD Secure Encrypted Virtualization (SEV)

https://www.amd.com/en/developer/sev.html


Does every memory read/write have to go through decryption/encryption or just the paging mechanism?


The architecture pattern is similar to Bromium/HP AX + Type 2 μXen on x86, https://www.youtube.com/watch?v=bNVe2y34dnM (2018), which ships on HP business PCs.

Bare metal runs a tiny L0 hypervisor making use of hardware support for nested virtualization. In turn, the L0 can run an L1 hypervisor, e.g. KVM or "host" OS, or minimal L1 VMs that are peers to the L1 "host"-guest of L0.

Google pKVM-for-Arm tech talk (2022), hopefully x86 will follow, https://www.youtube.com/watch?v=9npebeVFbFw


You can inspect their hypervisor code and verify that the host kernel cannot access the VM after creation, but if you are running as root then you can obviously inspect whatever processes are under host/hypervisor control.


You make the various hardware modules security context aware. You then give the host a separate security context from guests. You need a trusted hypervisor to bootstrap it.


Protected KVM on Arm64: A Technical Deep Dive, Quentin Perret (KVM Forum 2022), https://www.youtube.com/watch?v=9npebeVFbFw (first 5m)


It must be relying on a TPM somehow, right? That isn't possible with any normal software VM.


This eschews hardware-based TEEs (like TrustZone or a TPM) in favor of hardware support for nested virtualization, plus open-source L0 hypervisor code.

In the best-case future, this will offer security properties based on a small OSS attack surface rather than black-box TEE firmware.


OCaml has a separate language for module interfaces where types are required. Even better, it allows you to abstract over types and make them entirely opaque, so that users of an interface never have to look at any of the implementation details.
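
For example, a minimal sketch (the COUNTER signature and Counter module here are made up purely for illustration; in a real project the signature would typically live in a .mli file):

    (* The signature is the separate interface language: it declares the types
       and values an implementation must provide, and `type t` stays abstract,
       so users never see the representation. *)
    module type COUNTER = sig
      type t
      val zero : t
      val incr : t -> t
      val to_int : t -> int
    end

    (* One possible implementation. Because Counter is sealed with : COUNTER,
       callers cannot see (or depend on) the fact that t is an int. *)
    module Counter : COUNTER = struct
      type t = int
      let zero = 0
      let incr n = n + 1
      let to_int n = n
    end

Swapping the implementation for, say, an int64 or a record changes nothing for users of the interface.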

