Hacker News | btdmaster's comments

I think you could argue there is already some effort to do type safety at the ISA register level, e.g. shadow stacks or control-flow integrity. Isn't that very similar to this, except that it targets program state rather than external memory?


Tagged memory was a thing, and is a thing again on some ARM machines. Check out Google Pixel 9.


I mean, if the stacks grew upwards, that alone would nip 90% of buffer overflow attacks in the bud. Moving the return address from the activation frame into a separate stack would help as well, but I understand that having the activation frame be a single piece of data (essentially a current continuation's closure) can be quite convenient.


The PL/I stack growing up rather than down reduced the potential impact of stack overflows in Multics (and PL/I already had better memory safety, with bounded strings, etc.). TFA's author would probably have appreciated the segmented memory architecture as well.

There is no reason why the C/C++ stack can't grow up rather than down. On paged hardware, both the stack and heap could (and probably should) grow up. "C's stack should grow up", one might say.


> There is no reason why the C/C++ stack can't grow up rather than down.

Historical accident. Imagine if the PDP-7/PDP-11 had easily allowed for the following memory layout:

    FFFF +---------------+
         |     text      |  X
         +---------------+
         |    rodata     |  R
         +---------------+
         |  data + bss   |  RW
         +---------------+
         |     heap      |
         |      ||       |  RW
         |      \/       |
         +---------------+
         |  empty space  |  unmapped
         +---------------+
         |      /\       |
         |      ||       |  RW
         |     stack     |
    0000 +---------------+
Things could have turned out very differently than they have. Oh well.


Nice diagram. I might put read-only pages on both sides of 0 though to mitigate null pointer effects.


Is there anything stopping us from doing this today on modern hardware? Why do we grow the stack down?


The x86-64 call instruction decrements the stack pointer to push the return address, and x86-64 push instructions decrement the stack pointer as well. The push instructions are easy to work around, because most compilers already just allocate the entire stack frame at once and then do offset accesses, but the call instruction would be kind of annoying.

ARM does not suffer from that problem, thanks to its use of a link register and generic pre/post-modify addressing. RISC-V is probably also safe, but I have not looked specifically.


> [x86] call instruction would be kind of annoying

I wonder what the best way to do it (on current x86) would be. The stupid simple way might be to adjust SP before the call instruction, and that seems to me like something that would be relatively efficient (simple addition instruction, issued very early).


Some architectures had a CALL that was just "STR [SP], IP" without anything else, and it was up to the called procedure to adjust the stack pointer further to allocate space for its local variables and the return slot for further calls. The RET instruction would still normally take an immediate (just as e.g. x86/x64's RET does) and additionally adjust the stack pointer by its value (either before or after loading the return address from the tip of the stack).


Nothing stops you from having upward growing stacks in RISC-V, for example, as there are no dedicated stack instructions.

Instead of

  addi sp, sp, -16
  sd a0, 0(sp)
  sd a1, 8(sp)
Do:

  addi sp, sp, 16
  sd a0, -8(sp)
  sd a1, -16(sp)


HP-UX on PA-RISC had an upward-growing stack. In practice, various exploits were developed which adapted to the changed direction of the stack.

One source from a few mins of searching: https://phrack.org/issues/58/11


Linux on PA-RISC also has an upward-growing stack (AFAIK, it's the only architecture Linux has ever had an upward-growing stack on; it's certainly the only currently-supported one).


Both this and the parent comment about PA-RISC are very interesting.

As noted, a stack growing up doesn't prevent all stack overflows, but it makes it less trivially easy to overwrite a return address. Bounded strings also made it less trivially easy to create string buffer overflows.


Yeah, my assumption is that all the PA-RISC operating systems did, but I only know about HP-UX for certain.


In ARMv4/v5 (non-Thumb mode), the stack is purely a convention that the hardware does not enforce. Nobody forces you to use r13 as the stack pointer or to make the stack descending. You can prototype your approach trivially with small changes to gcc and the Linux kernel. As this is a standard architectural feature, qemu and the like will support emulating it, and it would run fine on real hardware too. I'd read the paper you publish based on this.


For modern systems, stack buffer overflow bugs haven't been great to exploit for a while. You need at least a stack cookie leak, and on Apple Silicon the return addresses are MACed, so overwriting them is a fool's errand (a 2^-16 chance of success).

Most exploitable memory corruption bugs are heap buffer overflows.


It's still fairly easy to exploit buffer overflows if the stack grows upward.


Everything really is a file: if you do `cat /`, you'll get back the internal representation of the directory entries in / (analogous to ls).

And they still had core dumps at the time if you pressed ctrl-\.


Being able to cat directories like that doesn't surprise me as much as the contents being readable. Is there not a bunch of binary garbage in between the filenames?


I remember `cat` on directories working on Unixen much newer than v4. Not sure if it ever was the case on Linux tho.


You can also press `s` to save data from a pipe to a file rather than manually copy pasting.


I came here to suggest the same! It's incredibly handy and I use it all the time at work: there's a process that runs for a very long time and I can't be sure ahead of time if the output it generates is going to be useful or not, but if it's useful I want to capture it. I usually just pipe it into `less` and then examine the contents once it's done running, and if needed I will use `s` to save it to a file.

(I suppose I could `tee`, but then I would always dump to a file even if it ends up being useless output.)


Yes, you can do this. Thanks for mentioning it; I was interested and checked how you would go about it.

1. Delete the shared symbol versioning as per https://stackoverflow.com/a/73388939 (patchelf --clear-symbol-version exp mybinary)

2. Replace libc.so with a fake library that has the right version symbol, using a version script, e.g. version.map: GLIBC_2.29 { global: *; };

With an empty fake_libc.c: `gcc -shared -fPIC -Wl,--version-script=version.map,-soname,libc.so.6 -o libc.so.6 fake_libc.c`

3. Hope that you can still point the symbols back to the real libc (either by writing a giant pile of dlsym C code, or some other way; I'm unclear on this part).

Ideally, glibc would stop checking the version if it's not actually marked as needed by any symbol; I'm not sure why it doesn't (technically it's the same thing in the normal case, so maybe performance?).


Ah you can use https://github.com/NixOS/patchelf/pull/564

So you can do e.g. `patchelf --remove-needed-version libm.so.6 GLIBC_2.29 ./mybinary` instead of replacing glibc wholesale (steps 2 and 3), and assuming everything the executable uses from glibc is ABI-compatible, this will just work (it's worked for a small binary for me, YMMV).


> When you get into lower power, anything lower than Steam Deck, I think you’ll find that there’s an Arm chip that maybe is competitive with x86 offerings in that segment.

At which point does this pay off the emulation overhead? Fex has a lot of work to do to bridge two ISAs while going through the black box of compiler-generated assembly, right?


As far as I'm aware, emulators like Fex are within 30 to 70% of native performance, worse or better at the fringes, but overall emulation seems totally fine. Plus, emulator technology in general could be used for binary optimization rather than strict mappings, opening up space for more optimization.


See also "Parse, don't validate (2019)" [0]

[0] https://news.ycombinator.com/item?id=41031585
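
The gist, as a rough TypeScript sketch (my own names, loosely mirroring the article's NonEmpty example rather than its actual code): a validator answers yes/no and lets callers forget the check, while a parser returns a narrower type that carries the proof with it.

  // Hypothetical illustration of "parse, don't validate".
  type NonEmptyArray<T> = [T, ...T[]];

  // "Validate": returns a boolean, but callers still hold a plain T[].
  function isNonEmpty<T>(xs: T[]): boolean {
    return xs.length > 0;
  }

  // "Parse": on success, returns a value whose type proves the invariant,
  // so downstream code cannot forget the check.
  function parseNonEmpty<T>(xs: T[]): NonEmptyArray<T> | null {
    return xs.length > 0 ? (xs as NonEmptyArray<T>) : null;
  }

  function head<T>(xs: NonEmptyArray<T>): T {
    return xs[0]; // safe by construction: the type guarantees an element
  }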



The Times is more or less lying here.

https://www.judiciary.uk/wp-content/uploads/2025/08/Wikimedi...

> On 18 March 2024, the Secretary of State was provided with a Submission which made it clear that Category 1 duties were not primarily aimed at pornographic content or the protection of children (which were dealt with by other parts of the Act).

Notice this is under Sunak, not Starmer. The Times chooses when to support and when to oppose the Online Safety Act based on which party is in government, and provides evidence for its view by lying through omission.

The Online Safety Act is undeniably terrible legislation, but you won't find good-faith criticism of it from the Times.


To anyone who does keyboard handling: please don't use KeyboardEvent.code like this site does unless you have a reason to ignore all users who use a different keyboard layout: https://developer.mozilla.org/en-US/docs/Web/API/KeyboardEve...

> Warning: This ignores the user's keyboard layout
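
A minimal sketch of the difference (hypothetical handler, not this site's actual code): event.code names the physical key position, while event.key reports what the user's layout actually types.

  // TypeScript sketch only; the handler and messages are made up.
  window.addEventListener("keydown", (event: KeyboardEvent) => {
    // Layout-aware: fires when the user types the letter "z" on any layout.
    if (event.key.toLowerCase() === "z") {
      console.log('typed "z"');
    }
    // Layout-ignoring: "KeyZ" is the physical QWERTY Z position, which is
    // labelled "Y" on a German QWERTZ keyboard, so this check surprises those
    // users. Reserve event.code for position-based controls like WASD.
    if (event.code === "KeyZ") {
      console.log("pressed the physical key at the QWERTY Z position");
    }
  });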


This is true, but there is a subtle point: the key K1 used for the classical algorithm must be statistically independent of the key K2.

If they're not, you could end up in a situation where the second algorithm is correlated with the first in some way and they cancel each other out. (Toy example: suppose K1 == K2 and the algorithms are OneTimePad and InvOneTimePad; they'd just cancel out to give the null encryption algorithm. More realistically, if I cryptographically break K2 from the outer encryption and K1 came from the same seed, it might be easier to find.)
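
A toy TypeScript sketch of that worst case (XOR-pad-style layers with a deliberately reused key; illustration only, not a real cipher):

  // Two XOR "pads" keyed with the same bytes undo each other.
  const xorPad = (data: Uint8Array, key: Uint8Array): Uint8Array =>
    data.map((b, i) => b ^ key[i % key.length]);

  const k1 = new Uint8Array([0x13, 0x37, 0xca, 0xfe]);
  const k2 = k1; // K2 is not independent of K1 (worst case: identical)

  const plaintext = new TextEncoder().encode("attack at dawn");
  const layered = xorPad(xorPad(plaintext, k1), k2);

  // The layers cancel: layered is byte-for-byte the plaintext, i.e. the
  // composition is the null encryption algorithm despite "encrypting" twice.
  console.log(new TextDecoder().decode(layered)); // "attack at dawn"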

