
I gave an un-conference talk at Clojure days Amsterdam called "Purely Functional OS" on this exact topic, and we had an incredibly interesting discussion about it.

The conclusion we kept coming back to is that it's technically not all that difficult to implement, but that to make it usable in the real world would mean that computers would have to get a lot more cautious about source vs. derived data.

The main thing is that on the one hand it seems pretty feasible to store all meaningful data a user generates over the course of using a computer and building an immutable persistent OS on top of that.

On the other hand, though, when you think about doing this on an actual computer as used today, you will very quickly run out of storage space.

What is separating these two scenarios is that a lot of data is (deterministically) derived from source inputs, but in the current state of technology it's impossible to determine what is derived data which can be easily recomputed and what is essential data without which the current state could not be reached.
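To make the source vs. derived distinction concrete, here is a minimal sketch of a store that tracks which values are essential and which can be thrown away and recomputed. The `Store` class and its method names are hypothetical, made up for illustration; it also assumes the derivation functions are deterministic and that source values are written once.

```python
from typing import Callable, Dict, Tuple


class Store:
    """Toy store separating source data from derived data (hypothetical design)."""

    def __init__(self) -> None:
        # Essential data: cannot be recomputed, must be kept.
        self.source: Dict[str, object] = {}
        # How each derived value is computed: (function, names of its inputs).
        self.recipes: Dict[str, Tuple[Callable[..., object], Tuple[str, ...]]] = {}
        # Derived data: only a cache, safe to discard at any time.
        self.cache: Dict[str, object] = {}

    def put_source(self, key: str, value: object) -> None:
        if key in self.source:
            raise KeyError(f"source value {key!r} is write-once")
        self.source[key] = value

    def define(self, key: str, fn: Callable[..., object], *deps: str) -> None:
        # Assumption: fn is deterministic, so its output depends only on deps.
        self.recipes[key] = (fn, deps)

    def get(self, key: str) -> object:
        if key in self.source:
            return self.source[key]
        if key not in self.cache:
            fn, deps = self.recipes[key]
            self.cache[key] = fn(*(self.get(d) for d in deps))
        return self.cache[key]

    def evict_derived(self) -> int:
        """Drop every derived value and report how many were freed."""
        freed = len(self.cache)
        self.cache.clear()
        return freed


store = Store()
store.put_source("text", "to be or not to be")
store.define("words", lambda t: t.split(), "text")
store.define("count", lambda w: len(w), "words")
```

Calling `store.get("count")` fills the cache for both `words` and `count`; after `store.evict_derived()` the same call recomputes them from `text` alone. The hard part described above is that today's systems do not record the `recipes` table at all, so nothing can tell the cache apart from the source.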

I happened to just be reading this article on unikernels which coincidentally is something that would help get us a bit further along in that direction: http://queue.acm.org/detail.cfm?id=2566628

P.S. When thinking about this, one key concept is to realize that you can do things like implement a 'torrent file system' where a file is just a torrent hash which can be requested and which signals when (parts of) it become available.
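A rough sketch of the "file is just a hash" idea, heavily simplified: a whole blob is identified by its SHA-256 digest, and a reader can block until the blob shows up. Real torrents hash individual pieces and use a different identifier scheme; the `ContentStore` class and its `offer`/`fetch` methods are hypothetical names used only for this illustration.

```python
import hashlib
import threading
from typing import Dict, Optional


class ContentStore:
    """Toy content-addressed store: a 'file' is named by the hash of its bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}
        self._arrived: Dict[str, threading.Event] = {}
        self._lock = threading.Lock()

    def _event(self, digest: str) -> threading.Event:
        # One event per digest, created on first use by either side.
        with self._lock:
            return self._arrived.setdefault(digest, threading.Event())

    def offer(self, data: bytes) -> str:
        """Make data available (e.g. after a download) and wake any waiters."""
        digest = hashlib.sha256(data).hexdigest()
        with self._lock:
            self._blobs[digest] = data
        self._event(digest).set()
        return digest

    def fetch(self, digest: str, timeout: Optional[float] = None) -> Optional[bytes]:
        """Wait until the content for digest is available; None on timeout."""
        if not self._event(digest).wait(timeout):
            return None
        with self._lock:
            return self._blobs[digest]
```

Because the name is derived from the content, the bytes behind a given reference can never change, which is what makes it fit an immutable OS: the reference can be stored cheaply and the content refetched from anywhere that has it.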



in the current state of technology it's impossible to determine what is derived data that can be easily recomputed and what is essential data without which the current state could not be reached.

Can you elaborate more on that statement? I see it as an expensive problem, but not impossible.

If you have a completely deterministic VM/instruction set, you can recompute any function output from any set of inputs. If you have non-deterministic input (network, input devices, randomness, etc) you can store those inputs to replay at a later time.
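A minimal sketch of that record-and-replay idea, assuming every non-deterministic input goes through a single choke point. `InputLog` and `simulate` are hypothetical names for illustration; the dice roll stands in for any non-deterministic source such as the network or an input device.

```python
import random
from typing import Callable, List, Optional


class InputLog:
    """Records non-deterministic inputs on a live run, replays them later."""

    def __init__(self, recorded: Optional[List[object]] = None) -> None:
        self.replaying = recorded is not None
        self.entries: List[object] = list(recorded) if recorded is not None else []
        self._cursor = 0

    def read(self, source: Callable[[], object]) -> object:
        if self.replaying:
            # Replay: ignore the real source, return what was seen last time.
            value = self.entries[self._cursor]
            self._cursor += 1
            return value
        # Live run: consult the real source and remember the answer.
        value = source()
        self.entries.append(value)
        return value


def simulate(log: InputLog) -> int:
    """Deterministic except for the inputs it pulls through the log."""
    total = 0
    for _ in range(5):
        total += log.read(lambda: random.randint(1, 6))
    return total


live = InputLog()
first_result = simulate(live)

# Replaying the recorded inputs reproduces the exact same result.
replay = InputLog(recorded=live.entries)
second_result = simulate(replay)
```

Only the five recorded inputs need to be stored; `first_result` itself is derived and can always be recomputed from them.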

I don't know if it's always necessary to store non-derived data. Do you need to save all keystrokes? All mouse moves? All network packets? Probably not. If those network packets just result in displaying a graph on the screen, they can probably be thrown away. Maybe you can throw away the headers and just compute based on the packet data. Or maybe you can simply store the fact that a stream of data that matches a cryptographic hash was received. It depends on what function receives that non-derived data.
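The "store only the hash of the stream" option could look something like this sketch. The function names are hypothetical, and it assumes the same bytes can be fetched again later from somewhere; if they cannot, the hash alone is not enough to recompute anything.

```python
import hashlib
from typing import Iterable


def stream_digest(chunks: Iterable[bytes]) -> str:
    """Hash a stream incrementally, without keeping the chunks around."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def verify_refetch(recorded_digest: str, refetched_chunks: Iterable[bytes]) -> bool:
    """Check that refetched data is the same stream that was originally received."""
    return stream_digest(refetched_chunks) == recorded_digest
```

The digest depends only on the concatenated bytes, not on how they were split into packets, so throwing away headers and packet boundaries does not affect the check.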

So I think the difficulty would be designing a system that would isolate derived data and replayable/recomputable/refetchable data, and the processes that compute that data -- and do so with reasonable efficiency. I think this could be done at the process level rather than the instruction set level (though you would have to have a restricted instruction set -- i.e. limited floating-point).


At the process level there are already plenty of programs that work this way and it's exactly the kind of thing the Clojure ecosystem is geared towards.

That's why the Purely Functional OS is essentially a thought experiment in what it would take to extend this to the entire operating system.

When I said "impossible given the current state of technology" I meant exactly what you said in your last paragraph though: impossible without designing an OS that isolates derived data.

I believe such a system will come in the form of persistent data structures coupled with reactive programming but, while I believe it's inevitable in the long run, extending that concept all the way to the OS will be quite the challenge.



In my view, the position "get a lot more cautious about source vs. derived data" is something of a mystery, and I can only see one thing behind it: temporality. Time-value.

If the CPU has 'now' to consider, and now is Time value [t] and if [t]'s are all there is between states, then 'think about [t1 .. t2]' should be a standard op.

And isn't this the point about immutability, that it has to have had a Time value, to be of any value at all, anyway, to the user?
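One possible reading of a "think about [t1 .. t2]" operation, sketched as an append-only timeline where nothing is overwritten and any past moment or interval can be queried. The `Timeline` class is a hypothetical illustration, not something proposed in the thread.

```python
import bisect
from typing import List, Optional, Tuple


class Timeline:
    """Append-only history of values, each tagged with the time it was recorded."""

    def __init__(self) -> None:
        self._times: List[float] = []
        self._values: List[object] = []

    def record(self, t: float, value: object) -> None:
        if self._times and t < self._times[-1]:
            raise ValueError("timestamps must be non-decreasing")
        self._times.append(t)
        self._values.append(value)

    def at(self, t: float) -> Optional[object]:
        """Value in effect at time t, or None if nothing was recorded yet."""
        i = bisect.bisect_right(self._times, t)
        return self._values[i - 1] if i else None

    def between(self, t1: float, t2: float) -> List[Tuple[float, object]]:
        """All (time, value) pairs recorded in the closed interval [t1, t2]."""
        lo = bisect.bisect_left(self._times, t1)
        hi = bisect.bisect_right(self._times, t2)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))
```

With values recorded at times 1, 5 and 9, `at(7)` returns the value recorded at 5, and `between(2, 9)` returns the last two entries.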


We actually solved that problem, and a purely functional distribution exists: it's called NixOS.


Very cool! "If you're the only one with an idea, it's probably a bad idea" :) Feel free to e-mail me, would be lovely to join forces.



