Slightly off topic. What are people doing about user data persistence on the cloud/Docker? Specifically, we are porting a desktop application to the cloud via application streaming technology with Docker, but we would like the user's data and preferences to not go "poof" when the cloud instance disappears. Ideally, we would like some automagic way to attach, say, the user's Dropbox account or the equivalent to the cloud instance. Is anyone working on that problem?
You can use something like GlusterFS and distribute your files, or it can hook into your cloud provider and create persistent storage automatically on something like EBS.
When I tried it way back when, the trouble I had with Gluster was that it didn't readily handle nodes randomly leaving and new ones joining. It was more hardware-centric in that if a node left, you were expected to bring that specific node back online. Is that still the case?
I don't have direct experience with any of these, but the Docker approach here is that Docker exposes a Volumes Plugin API[1], which allows third parties to implement portable volume plugins that let a volume persist across hosts and across containers.
A bunch of plugins have been implemented[2], but I haven't personally heard any real success stories of people using them, which doesn't necessarily mean such stories don't exist.
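For a rough idea of what such a plugin has to answer, here is a minimal sketch in Python of the request handling only. The endpoint names and response keys follow the Docker volume plugin protocol (JSON over HTTP) as I understand it; the `handle` function, the in-memory `VOLUMES` table, and the `/mnt/volumes` root are hypothetical, and a real plugin would attach actual storage (EBS, Gluster, etc.) in `Mount` and serve this over a socket that Docker can discover.

```python
# Hypothetical in-memory state: volume name -> mountpoint on the host.
VOLUMES = {}
MOUNT_ROOT = "/mnt/volumes"  # assumed path, purely illustrative

# Endpoints that operate on one named volume that must already exist.
PER_VOLUME = {
    "/VolumeDriver.Remove",
    "/VolumeDriver.Mount",
    "/VolumeDriver.Unmount",
    "/VolumeDriver.Path",
    "/VolumeDriver.Get",
}


def handle(endpoint, body):
    """Answer one plugin request; `body` is the already-decoded JSON dict."""
    if endpoint == "/Plugin.Activate":
        return {"Implements": ["VolumeDriver"]}
    if endpoint == "/VolumeDriver.Capabilities":
        # "global" scope tells Docker the volume is reachable from any host.
        return {"Capabilities": {"Scope": "global"}}
    if endpoint == "/VolumeDriver.List":
        volumes = [{"Name": n, "Mountpoint": m} for n, m in sorted(VOLUMES.items())]
        return {"Volumes": volumes, "Err": ""}

    name = body.get("Name", "")
    if endpoint == "/VolumeDriver.Create":
        # A real plugin would provision backing storage here.
        VOLUMES.setdefault(name, f"{MOUNT_ROOT}/{name}")
        return {"Err": ""}

    if endpoint not in PER_VOLUME:
        return {"Err": f"unsupported endpoint: {endpoint}"}
    if name not in VOLUMES:
        return {"Err": f"no such volume: {name}"}

    if endpoint == "/VolumeDriver.Remove":
        del VOLUMES[name]
        return {"Err": ""}
    if endpoint in ("/VolumeDriver.Mount", "/VolumeDriver.Path"):
        # A real plugin would attach/mount remote storage before answering Mount.
        return {"Mountpoint": VOLUMES[name], "Err": ""}
    if endpoint == "/VolumeDriver.Unmount":
        return {"Err": ""}
    # Only /VolumeDriver.Get remains.
    return {"Volume": {"Name": name, "Mountpoint": VOLUMES[name]}, "Err": ""}
```

The point of the design is that the container only ever sees a volume name; which host the data is physically attached to is the plugin's problem, and that is what lets the volume outlive any one instance.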
On a separate note, many other teams run stateful services that can handle the complete loss of a node's data. For example, it seems popular to run Elasticsearch in Docker, though again I'm still learning about this pattern, too.
I think this is where the idea of co-locating compute and storage really shines (e.g., Joyent/Manta, scaleableinformatics.com) - moving all state to NFS sounds like a great way to widen the performance gap between RAM and permanent storage... Not to mention that if you really want to spread your data and your application, you would require NFS over VPN (unless encrypted NFSv4 actually works now?).
Can you really run a transaction-oriented DB over NFS "in the cloud" with meaningful performance and guaranteed writes to disk in case of network/power failure?
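To make "guaranteed writes" concrete, here is a minimal sketch in Python of what an application has to do on a POSIX system before it can claim a write is durable. The `durable_write` helper is hypothetical, not from any library. On NFS the same calls only guarantee as much as the server does: as far as I know, a server exported with the `async` option may acknowledge data before it reaches stable storage.

```python
import os
import tempfile


def durable_write(path, data):
    """Atomically replace `path` with `data` (bytes) and flush it to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the rename stays atomic.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push it to stable storage
        os.replace(tmp, path)     # atomic swap on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # fsync the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Every `fsync` here becomes at least one network round trip when the file lives on NFS, which is where the performance worry for a transaction-oriented DB comes from.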