Funny they didn't just extend the 32 bit reverse algorithm to 64 bits: https://g...

chrchang523 · on Dec 19, 2022

Yes, this library is of uneven quality, and would benefit from a few hours of focused attention from a specialist. E.g. the various HighestBit functions below what you've called out also look inefficient relative to something using one of the builtin_clz intrinsics, even though there's an earlier use of builtin_clz...
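
For illustration, here is a rough sketch of what a clz-based highest-bit helper could look like. The names `highest_bit32` and `highest_bit64` are hypothetical, not the library's, and the return convention (-1 for zero) is an assumption. It relies on the GCC/Clang `__builtin_clz` and `__builtin_clzll` builtins, whose result is undefined for a zero argument, hence the explicit check.

```c
#include <stdint.h>

/* Hypothetical helper: index (0..31) of the highest set bit, or -1 if x == 0.
 * Assumes GCC or Clang, and that unsigned int is 32 bits wide. */
static int highest_bit32(uint32_t x) {
    /* __builtin_clz is undefined for 0, so guard it. */
    return x ? 31 - __builtin_clz(x) : -1;
}

/* Hypothetical helper: index (0..63) of the highest set bit, or -1 if x == 0. */
static int highest_bit64(uint64_t x) {
    return x ? 63 - __builtin_clzll(x) : -1;
}
```

On targets with a count-leading-zeros instruction this typically compiles to a handful of instructions, but I haven't benchmarked it against the library's versions.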

asveikau · on Dec 19, 2022

I think a lot of the time, a header like this comes into being because someone has a narrow need for an operation that is deemed reusable and abstract. Nobody needs the 64-bit version of a 32-bit bit utility at that point. Then somebody comes in later and tries to fill out more stuff.

rsaxvc · on Dec 19, 2022

I haven't tested this, but on some 32-bit machines, such as Arm32 platforms, this isn't a bad way to do it.

If the compiler recognizes the 32-bit approach and rbit is available, the 64-bit variant is usually 2 rbits with the registers swapped.
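
Roughly, the shape I have in mind is below. This is a sketch, not the library's code: `reverse32` is the usual swap-based reversal, and `reverse64` (a hypothetical name) reverses each 32-bit half and swaps them. Whether a given compiler actually turns `reverse32` into a single rbit is something you'd have to check in the generated assembly.

```c
#include <stdint.h>

/* Swap-based 32-bit bit reversal: swap adjacent bits, then pairs,
 * nibbles, bytes, and finally the two 16-bit halves. */
static uint32_t reverse32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* 64-bit reversal built from two 32-bit reversals with the halves swapped:
 * the reversed low word becomes the high word and vice versa. */
static uint64_t reverse64(uint64_t x) {
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    return ((uint64_t)reverse32(lo) << 32) | reverse32(hi);
}
```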

thomasahle · on Dec 19, 2022

If the compiler can recognize the 32-bit version, it really should be able to recognize the 64-bit version too, no?

rsaxvc · on Dec 19, 2022

Ideally yes. Only last year did Clang 13 learn the 32-bit one. GCC doesn't understand either. I haven't checked clang for the 64-bit one.

SideQuark · on Dec 19, 2022

On many platforms, a shift by more than 31 bits often did nothing, even though it should have done something. Intel, for example, masked the count to 5 bits, breaking a lot of 64-bit compiler hacks.
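
To make the hazard concrete: in C, shifting a 32-bit value by 32 or more is undefined behavior, so code that wants a well-defined result has to handle large counts itself. A minimal sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Hypothetical helper: left shift of a 32-bit value that yields 0 for
 * counts >= 32, instead of relying on whatever the hardware does
 * (e.g. masking the count to its low 5 bits). */
static uint32_t shl32_safe(uint32_t x, unsigned n) {
    return n < 32 ? x << n : 0;
}

/* Hypothetical helper: widen to 64 bits BEFORE shifting, so that counts
 * in the range 32..63 move bits into the upper half as intended. */
static uint64_t shl64_from32(uint32_t x, unsigned n) {
    return n < 64 ? (uint64_t)x << n : 0;
}
```

The common bug is writing `x << n` with a 32-bit `x` and only widening the result afterwards, which shifts in 32 bits no matter how wide the destination is.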

There is also the issue of cache and speed - expanding to full size would have different footprints, so perhaps they tested that too.

This is likely the result of working around those problems.