
But that’s building off ‘8’ as the starting unit. I can definitely imagine a counterfactual where 12bit bytes became the basic building block. Imagine if early text encoding had assumed that more than the Latin alphabet would be useful (say, if standards had been set by Japanese companies rather than American ones) - an early, maybe 10bit, text standard would have weighed in favor of 12bit bytes. 12bit bytes also work great for encoding RGB values for video - and RGBA for image processing.

The importance of having a power-of-two number of bits might just not be that big a deal held up against all those upsides. After all, when do you really care how many bits it takes to store an index into a byte, or a word, anyway? The << and >> bitshift operators even on a 64bit number are only using 6 bits of their operand, so even on an 8-bit-centric architecture you’re not realizing any great synergy with the powers of two.

Is it so hard to imagine instead a 12-bit era, followed by a 24-bit revolution, and maybe now we’d be talking about the end of 48-bit systems as the 96-bit processors start coming to market?



> I can definitely imagine a counterfactual where 12bit bytes became the basic building block.

It's definitely fun to imagine. I personally have a soft spot for hypothetical ternary architectures, where the "trits" are either -1, 0 or 1. There's a mathematical elegance to that number system that binary can't match.

> 12bit bytes also work great for encoding RGB values for video

Can you elaborate? Do you mean historically? The more bits the better, of course -- I get why 12 are better than 8, but it's not clear to me why you would want to stop at 12 bits. And if you don't stop there, what's the advantage?

> The << and >> bitshift operators even on a 64bit number are only using 6 bits of their operand, so even on an 8-bit-centric architecture you’re not realizing any great synergy with the powers of two.

6 bits is still cleaner than the ~6.58 bits required to encode shifts on this hypothetical 96-bit architecture; with a power-of-two width you can just use the lowest 6 bits of the operand.
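For what it's worth, this is literally how x86-64 behaves: the hardware masks a 64-bit shift count down to its low 6 bits. A small sketch in C (the masking has to be explicit here, since shifting by the full width or more is undefined in C):

```c
#include <stdint.h>

// x86-64 masks the count of a 64-bit shift to its low 6 bits.
// In C we must mask explicitly, since shifting by >= the type
// width is undefined behavior.
uint64_t shl64(uint64_t x, unsigned count) {
    return x << (count & 63);  // only the low 6 bits of the count matter
}
```

So a shift by 64 is a shift by 0, and a shift by 65 is a shift by 1, exactly as the masked hardware count would give you.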

And in general I think there are more places where stuff like this comes up than is apparent at first glance. How many bytes in a page? How many pages in a RAM chip? How many address lines and data lines?

In the current era, RAM chips are agnostic about word size (as long as it's a power of two) because what they actually address are these much larger pages that chips can cut up in any way they want. Perhaps if everything standardized on 12 bits as the base at once, you could solve that or find workarounds, but I think it would be mathematically tedious at every turn. It's so easy to just chop off a few bits and not worry about it.


To elaborate on “12bit bytes also work great for encoding RGB values for video”:

I just mean that it’s easy to encode an RGB value as three four-bit values in a 12-bit number. Aligning ‘memory addresses’ to ‘pixels’ would simplify video hardware. Historically, 4096-color modes based on 12 bits per pixel were used (notably on the Amiga), in spite of byte-alignment issues. And of course 24-bit pixels would map to the common 16 million colors we all know and love. 8, 16 and 32 bit architectures have always had compromises of one sort or another for storing three color channels - I half feel like the ubiquity of ARGB in modern graphics is down to our finally accepting that storage and memory are now cheap enough we should stop worrying about waste, pad out pixels with another channel, and just find a use for it. Transparency? Sure, why not.
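The packing itself is about as simple as bit twiddling gets; a sketch in C (the `pack_rgb444` name is mine, and the value lands in a 16-bit integer only because C has no 12-bit type):

```c
#include <stdint.h>

// Pack three 4-bit channels into one 12-bit RGB444 pixel
// (the Amiga-style 4096-color layout). Stored in a uint16_t
// since C has no 12-bit integer type.
uint16_t pack_rgb444(unsigned r, unsigned g, unsigned b) {
    return (uint16_t)(((r & 0xF) << 8) | ((g & 0xF) << 4) | (b & 0xF));
}
```

In a 12-bit-byte world, one pixel is exactly one byte, so there is nothing left over to waste or pad.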

Regarding how RAM is laid out… a lot of early computers used ram chips striped by bit - so they’d have eight ram chips all wired to the same address bus, with each responsible for storing one data line for each address. Twelve bit memory is just laying four more data lines and adding four more chips.

In my hypothetical counterfactual universe, obviously if 12 bit bytes win, later memory architectures are built around 12/24/48/96 bit data buses, so whatever paging and slicing they do would be within that model.


Interesting. Also, 8 bits per channel isn't even a natural endpoint. 256 values aren't enough to cover the full range of human perception to any reasonable accuracy, which is why 16-bit floating point per channel is now sometimes used to encode HDR. (But to be clear, at current RAM sizes, it wouldn't really matter what the base is.)


RAM chips already use weird numbers of bits all over. For example, DDR4 has 3 bits for chip select, 2 bits for bank group, 2 bits for bank, 18 bits for row, and 10 bits for column. This change wouldn't affect them in any meaningful way.

And in the old days you had arbitrary numbers of address bits too, based on your chip size. I don't see any problems.

Besides, RAM is generally a bunch of chips in parallel, so you could use our world's chips with no change. For example, put 6 on a stick instead of 8.


This is exactly what happened in digital audio signal processing and recording, where word size represents amplitude. 12-bit audio was the first word size that provided a pretty good noise floor by the late 1980s, a real improvement over 8-bit. And by the mid-80s the CD format was already providing 16-bits for playback, which really is good enough for most playback scenarios. The 16-bit DSP era was just a few years longer than the 12-bit and quickly gave way to 24-bit, which provides a noise floor good enough for almost anything audio processing and recording related and is still the standard after more than 20 years. I have gear that defaults to 32-bit now, obviously just for power-of-2 convenience in software dev, which is annoying because the file sizes are bigger for basically no reason.
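The noise-floor gain per bit is easy to quantify: the standard rule of thumb for quantization SNR of an n-bit PCM signal (full-scale sine) is roughly 6.02n + 1.76 dB, which is why 16-bit lands near 98 dB and 24-bit near 146 dB. A quick sketch of that rule of thumb:

```c
#include <math.h>

// Rule-of-thumb quantization SNR for n-bit PCM with a
// full-scale sine input: about 6.02*n + 1.76 dB.
static double snr_db(int bits) {
    return 6.02 * bits + 1.76;
}
```

By that measure each extra bit buys about 6 dB, so the jump from 12 to 16 bits was worth roughly 24 dB, and 16 to 24 another ~48 dB.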

(The master buses in DAWs and digital hardware use even larger word lengths these days, but it's not really the same thing, that's a summing and calculation process)


> maybe now we’d be talking about the end of 48-bit systems as the 96-bit processors start coming to market?

Notably, all our CPUs have had 48-bit memory addresses since the introduction of 64 bit chips, and chips that bump up to 57 bits are in the middle of being introduced.

But in part they're bumping that up because it's easy. The 48 bit mode was an optimization, and they left room.

With 48 bits as a hard limit of what fits into a register, and that representing 256 terabytes / 384 tera-octets of memory, I suspect we would stick with 48 bits for many more years.


Notably, x86-64 canonical form uses the full 64 bit address when referring to memory, and that's what's in the registers; it just doesn't handle addresses in the middle of the range (i.e. it counts addresses in two halves, one half from FFFF... down and one half from 0000... up, rather than just the first 48 bits of addresses).


I'd say it does use just the first 48 bits, but in a sign-extended way. You could sign-extend to any size you want, even beyond 64 bits.

And it forces the program to do the sign extension, so it'll be forwards-compatible with chips that use more bits. But once those bits are verified they get immediately discarded.
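That program-side sign extension is a single shift pair: move the 48-bit value to the top of the register, then arithmetic-shift it back down so bit 47 replicates upward. A sketch in C (this relies on arithmetic right shift of negative signed values, which is implementation-defined in C but what gcc/clang do on x86):

```c
#include <stdint.h>

// Sign-extend a 48-bit value into a canonical 64-bit x86-64
// address: shift bit 47 up to bit 63, then arithmetic-shift
// back down so it replicates through the upper bits.
// (Assumes arithmetic right shift of signed values, as on gcc/clang.)
uint64_t canonicalize48(uint64_t a) {
    return (uint64_t)(((int64_t)(a << 16)) >> 16);
}
```

A value with bit 47 clear comes back with the top 16 bits zero; one with bit 47 set comes back with them all ones, matching the two halves of the canonical address space.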


The method would scale to any number of bits but the actual registers and addresses passed to the MMU really are 64 bit in real processors, as in you can actually set it to some other non-sign extended 48 bit value with your program today and you’ll just get an MMU exception when you go to use it. After all the CPU doesn’t know something in a register is to be treated as a memory address until after the value is already loaded into the register.

The advantage of 48 bit addressing in x86-64 is fewer levels of page lookups (a speedup), not abbreviated addresses.


> as in you can actually set it to some other non-sign extended 48 bit value with your program today and you’ll just get an MMU exception when you go to use it

You get an error because you didn't sign extend. That doesn't tell you anything concrete about the address size.

> After all the CPU doesn’t know something in a register is to be treated as a memory address until after the value is already loaded into the register.

> The advantage of 48 bit addressing in x86-64 is fewer levels of page lookups (a speedup), not abbreviated addresses.

If you're taking advantage of the address size in your program, you do only load 48 bits into your register from memory. Usually this is inside of a larger struct, or you've partitioned a 64 bit word into address plus flags.

One method that gets significant use is NaN boxing for dynamically typed values. Double precision floats are stored as-is, while other types of value including multiple types of pointer are squeezed into the 53 unused bits of a NaN.
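A minimal sketch of that NaN-boxing trick in C (the masks and helper names here are illustrative; real engines like SpiderMonkey and JavaScriptCore differ in the details and also encode type tags in the payload):

```c
#include <stdint.h>
#include <string.h>

// NaN-boxing sketch: a quiet NaN has the exponent field all ones
// plus the quiet bit set, which leaves the low 48 bits free to
// carry a pointer-sized payload.
#define QNAN_BITS    0x7FF8000000000000ULL
#define PAYLOAD_MASK 0x0000FFFFFFFFFFFFULL

static double box_ptr(uint64_t p) {  // p: a 48-bit pointer value
    uint64_t bits = QNAN_BITS | (p & PAYLOAD_MASK);
    double d;
    memcpy(&d, &bits, sizeof d);  // bit-cast without aliasing UB
    return d;
}

static uint64_t unbox_ptr(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits & PAYLOAD_MASK;
}
```

Ordinary doubles are stored untouched, and anything whose top bits match the quiet-NaN pattern is treated as a boxed value instead.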


You can do “mov rax, [0xFFFF…FF]” or “mov eax, 0x1234…FF” and in each case you have loaded either a 64 bit or 32 bit value into the register (depending which memory mode you’re using). Saying it came from a struct or larger abstract type or has leading 0s does not change the actual hardware and make the register itself 48 bits.


You could use a 16 bit load and a 32 bit load. I'm not sure what's faster.

If I make a struct that has a 32 bit field followed by a 48 bit field, I might use a 64 bit load as part of getting the latter. But that doesn't make the field larger. It's an optimization because it's safe to overshoot.

Edit: When I wrote this comment it was replying to a request for a 48 bit load instruction.


You can’t do a 32 bit load followed by a 16 bit load: the memory address is in one register, so the only option is for a 64 bit value to be stored there, and you need to know what the first 16 bits are to see whether the last 48 bits refer to the high or low portion. It’s trivial to check that all 64 bits are passed, since something like “0xFF0F…” gives an exception but writing “0x00F0…” to the same register and doing the load/call does not.

It’s not a matter of what wizardry you can do in high level languages to act like it’s a true 48 bits for the sake of speed/size in your data structures; it’s a matter of what happens when the rubber hits the road and the CPU loads the address. One thing you can really do on x86 CPUs is run them in 64 bit mode (i.e. get the newer instructions and registers) but only use 32 bit addressing. This does actually go faster due to halving the address sizes; some examples can be found in the Linux kernel as the x32 ABI.


> You can’t do a 32 bit load followed by a 16 bit load: the memory address is in one register, so the only option is for a 64 bit value to be stored there

Putting the bits together into one register and doing the sign extension is a matter of arithmetic. It doesn't touch memory any more. You very much can do a 32 bit load and a 16 bit load and no other loads.
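Concretely, the 32-bit-load-plus-16-bit-load composition described above can be sketched in C like this (the `load48` name is mine; little-endian layout is assumed, as on x86):

```c
#include <stdint.h>
#include <string.h>

// Load a 48-bit pointer field using one 32-bit load and one
// 16-bit load, then sign-extend from bit 47. Assumes a
// little-endian byte layout, as on x86.
static uint64_t load48(const unsigned char *p) {
    uint32_t lo;
    uint16_t hi;
    memcpy(&lo, p, 4);      // the 32-bit load
    memcpy(&hi, p + 4, 2);  // the 16-bit load
    uint64_t v = (uint64_t)lo | ((uint64_t)hi << 32);
    // Replicate bit 47 upward to produce a canonical 64-bit address.
    return (uint64_t)(((int64_t)(v << 16)) >> 16);
}
```

Only six bytes are read from memory; the arithmetic afterward fills in the redundant top 16 bits.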

> you need to know what the first 16 bits are to see whether the last 48 bits refer to the high or low portion

No you don't. That's not how the sign extending works. The top seventeen bits of the 64-bit value are all the same.

> what happens when the rubber hits the road and the CPU loads the address

The CPU will fetch an entire cache line, 64 bytes on typical x86, but you only need to pull 6 bytes out of L1.

Then you can sign extend the upper bits and use your address as normal, to load any number of bytes from elsewhere in memory (maybe only one byte if you're feeling feisty).

Also you added the part about the register itself not being 48 bits to your previous comment after I replied, but I've never claimed the register was 48 bits. Just that it's effectively containing a 48 bit value. The reason I keep hammering on about sign extension is that you could put the same memory addresses into an 80 bit register, but it wouldn't make them 80 bit addresses. You're just using a register of arbitrary size to store a 48 bit signed number.


> Putting the bits together into one register and doing the sign extension is a matter of arithmetic. It doesn't touch memory any more. You very much can do a 32 bit load and a 16 bit load and no other loads.

There is no 48 bit register, nor can you make one via a 32 and a 16 bit load; you'd still have a 64 bit register you wrote 48 bits into. It's just that you've assumed you're only using the lower half of the mapping, so 0x0000... is a valid sign extension. If you were trying to access the other half of addressable memory, that wouldn't work.

> No you don't. That's not how the sign extending works. The top seventeen bits of the 64-bit value are all the same.

The bits are the same if you want it to work, but that's not the same thing as the bits being extended for you. You still actually have to load the sign-extended bits: if you put 0x0000F22... into the register and tell it to load the memory address, it doesn't turn the address into 0xFFFFF22... for you. Your compiler or OS might be doing that on your behalf, but the CPU does not. You will get an exception from the MMU if the memory address is not already sign extended when referenced. It may appear to work that way if you only ever work in user space, where 0x0000... happens to extend the lower half for you if you don't set it, but only half the memory is mapped in a way where that holds true, and it's still passing those 0's to the MMU (as again is evident if you set one of those bits to something else and then try to call the address).

I think the rest follows from the above. It's not a matter of theory: just create an assembly program that tries to load rax set to 0xFF0F...F20000 and compare it to one that tries to load rax set to 0xFFFF..F20000. One works, the other doesn't. The only explanation is the full value is being sent. If only 48 bits were sent to the MMU and it sign extended automatically, 0xFF0F wouldn't create an exception. It's a 64 bit address where only 48 bits' worth are legal; that's very different from a 48 bit address.
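The check being described here is simple to state: bits 63 through 47 must all agree. A sketch of that canonical-address test in C:

```c
#include <stdint.h>

// x86-64 canonical-address check (48-bit implementations):
// bits 63..47 must all equal bit 47. Arithmetic-shifting away
// the low 47 bits leaves either all zeros or all ones.
static int is_canonical48(uint64_t addr) {
    int64_t top = (int64_t)addr >> 47;  // assumes arithmetic shift, as on gcc/clang
    return top == 0 || top == -1;
}
```

This matches the examples in the thread: an address like 0xFF0F... fails the check and faults, while 0xFFFF... and low-half 0x0000... addresses pass.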


> There is no 48 bit register nor can you make a 48 bit register via a 32 and 16 bit load, you'd still have a 64 bit register you wrote 48 bits into it's just you've assumed you're going to be using only the lower half of the mapping so 0x0000... is a valid sign-extension. If you were trying to access the other half of addressable memory that doesn't work though.

You need to rearrange the bits into place. If you mov/shift the correct way it'll sign extend with 0x0000 or 0xffff as appropriate.

> it's not the same thing as the bits being extended for you

Yes you have to tell the CPU to sign extend. I wasn't trying to say otherwise. But you don't have to load those bits from anywhere. You only have to load 48 bits from memory.

> The only explanation is the full value is being sent. If only 48 bits were sent to the MMU

It's still possible the verification happens pre-MMU, but either way as soon as those bits are verified they get discarded. Only 48 bits are used for any functionality.

> and it signed extended automatically

Well again I wasn't saying that. The program needs to ask for the sign extension.


In your high level logic you can treat memory addresses as 17 bits and transform them into 64 bit memory addresses on actual load, but that doesn't mean the architecture uses 17 bit memory addresses; it just means you're generating the addresses on the fly.

> It's still possible the verification happens pre-MMU

It's also possible there is a teapot in orbit between Earth and Mars, but actual x86 CPUs don't deal in MARs (memory address registers) that would have that kind of logic, which is probably a good thing given all the different memory modes multiplied by the number of registers you can use. There are AGUs in newer x86 CPUs (at least Intel's; I assume AMD's too) which offload common memory offset calculations, but they take and return the standard register load size.

That only 48 bits are used for any functionality is a fair description though, assuming we're still excluding newer 57 bit CPUs. The main thing I was taking issue with was your initial statement "With 48 bits as a hard limit of what fits into a register" not how many bits of the register or memory address are functionally meaningful.


> The main thing I was taking issue with was your initial statement "With 48 bits as a hard limit of what fits into a register" not how many bits of the register or memory address are functionally meaningful.

Oh, Oh!

That was a hypothetical talking about 48 bit CPUs that grew out of 12 bit CPUs.

That's why I said 384 tera-octets and something we would stick with.


Makes sense :) thanks


Historically, there were a number of 24-bit, 36-bit, and 48-bit CPU designs; some modern DSPs still have 24-bit ALUs, IIRC.

Interestingly, all these numbers divide cleanly by 8, except for 36. I've never heard of 12-bit CPUs though.

Where octal representation shone was 3-bit fields in some CPU code, especially in the venerable PDP-11. Using octal representation for them was the only sane way.

Coincidentally, much of UNIX was built on various PDP machines.

Also, early TTYs, that is, actual mechanical teletypes, were often 6-bit, which also sits well with octal encoding.


> I never heard about 12-bit CPUs though.

The original article points to the DEC PDP-8. That 12-bit architecture, which built on the PDP-5, also 12-bit, wound up in microprocessor form in the Intersil 6100, and DEC actually tried to build a desktop microcomputer based on that chip - the DECmate (https://en.wikipedia.org/wiki/DECmate).

The 16-bit PDP-11 was literally the result of typical DEC indecision, infighting and hedging in the face of increasing importance of ASCII processing (and ultimately a response to a bunch of their engineers who had seen that writing on the wall leaving and founding Data General).

Like so many things in computer history, if DEC had had its shit together, things would have worked out very differently.



