Great perspective. Totally agree. We are using nginx to proxy to couchdb and do ...

xrd · on June 7, 2020

This is a great comment.

In all fairness, they did have an auth story, right, and recent documentation suggests they reconsidered that path and now suggest keeping it out of couchdb. So, to me, this says auth was something they never could get right because it is complicated. And I took that a step further to think that it's better to have it outside because you can use any auth solution you want, instead of what the couchdb people felt was the best way to do it (it's smart that they realized they got it wrong and changed course).

I'm confused why everyone here seems to think reverse proxies and auth proxies are complicated. Isn't it the case that all apps of any complexity are a bunch of small services wired together behind a proxy? My auth proxy is all of 50 lines of code, my reverse proxy is 8 lines in nginx conf and it's all held together with a docker compose file that is declarative and works locally as well as on my production server.

lukevp · on June 7, 2020

Thanks for the shout out.

The problem with auth is in the authorization realm, not the authentication realm. It's super easy to do authentication in nginx for example using client certificates, basic auth, api keys, or in your app layer via checking a separate DB or cache for tokens, say via a JWT.
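To make the "authentication is the easy part" point concrete, here is a minimal sketch of checking an HS256-signed JWT in the app layer using only the Python standard library. The function names (`sign_token`, `authenticate`), the shared secret, and the use of the `sub` and `exp` claims are assumptions for illustration, not anything CouchDB or nginx prescribes; a real deployment would more likely use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256 JWT (included only so the example is self-contained)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def authenticate(token: str, secret: bytes, now: Optional[float] = None) -> Optional[str]:
    """Return the user name from the `sub` claim, or None if the token is rejected."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        # Constant-time comparison of the expected and presented signatures.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        # Wrong number of segments, bad base64, or bad JSON.
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and current >= exp:
        return None  # token has expired
    sub = claims.get("sub")
    return sub if isinstance(sub, str) else None
```

All this answers is "who is calling". It says nothing about which documents that caller may see, which is the hard part.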

However, it's not trivial to say user "a" owns document "b" but not document "c", and that user "a", in fact, shouldn't even know that document "c" exists at all, while also maintaining the replication that CouchDB has built in.

What we want is one single DB that has all documents, and can replicate all documents on the server level with another DB, but can expose a user-specific changes feed that replicates only data that user has access to. I should be able to grant / revoke access to a document at any time and have it propagate, and all documents should be owned only by the creating user by default.

The choice they are requiring us to make is to shard data ourselves by user (which is unusable in the context of 2 users sharing data), or implement a separate layer that can understand CouchDB replication and can do the filtering of the changes feed as well as the write access to ensure that the documents are restricted correctly.
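As a rough sketch of what that separate layer has to do, here is the filtering side in Python. It assumes each document carries hypothetical `owner` and `shared_with` fields (CouchDB defines neither), and that the layer reads the changes feed with `include_docs=true` so every row has a `doc` to inspect. The function names are illustrative only.

```python
from typing import Any, Dict, Optional

Doc = Dict[str, Any]


def can_read(user: str, doc: Doc) -> bool:
    # Hypothetical ACL fields: `owner` is a user name, `shared_with` a list of names.
    return doc.get("owner") == user or user in doc.get("shared_with", [])


def can_write(user: str, existing: Optional[Doc], incoming: Doc) -> bool:
    if existing is None:
        # New documents must be owned by the creating user.
        return incoming.get("owner") == user
    # Only the current owner may update, and may not hand ownership away here.
    return existing.get("owner") == user and incoming.get("owner") == user


def filter_changes(feed: Dict[str, Any], user: str) -> Dict[str, Any]:
    """Keep only the rows of a changes response that `user` is allowed to read."""
    visible = [
        row
        for row in feed.get("results", [])
        if "doc" in row and can_read(user, row["doc"])
    ]
    # last_seq is passed through so the client's checkpoint still advances
    # past rows it was not allowed to see.
    return {"results": visible, "last_seq": feed.get("last_seq")}
```

Even this toy version shows the gap: when access to a document is revoked, the row simply stops appearing in the filtered feed, so a client that already replicated the document never learns it should drop it. Propagating grants and revocations needs more than a filter.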

Take this a step further: now that I have to implement my own authentication AND authorization outside of CouchDB and protect it from the end user directly accessing it, you could ask why I even need CouchDB at all. Why not just speak the CouchDB replication protocol on top of my own auth database and store documents there too?

xrd · on June 7, 2020

I was thinking about my comment, and thought: "I need to ask if you mean authz or authn?" That's funny you said it first.

All those things you note are important, and I just can't see how CouchDB would get those things right inside CouchDB.

My impressions with Firebase/Firestore were that they tried to do that with their "rules" system, and it was a not-quite-JS declarative system that relied on you understanding the non-standard parts of their auth system and the things they exposed. I always felt like this was going to be a huge hole in my app and I would have no way to validate all the edge cases. I feel like this is a complicated beast to do in a generic way and CouchDB was smart to leave it to the app developer, rather than the devops/sysadmin role.

Aren't you being a little stingy with your appreciation for the sync part of CouchDB/PouchDB and a little bit overblown with your worries about understanding the CouchDB "protocol?"

Replication and sync are really challenging problems even if you just think about sharing data in two places, and when you start having your JS code deal with revisions, it gets messy really quickly. What appealed to me about PouchDB was never having to really think about sync, other than how to handle conflicts.

But, the CouchDB protocol is just HTTP. And, making a proxy to talk to that is as simple as importing an HTTP proxy module. You are just responsible for your authz logic, which is hard, but at least you can make it exactly the way you want it. I don't really see how mapping your authz logic onto standard HTTP verbs like POST, GET, PUT is that complicated.
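Here is a minimal sketch of what I mean by mapping authz onto verbs, written in Python as a pure decision function that a proxy handler would call before forwarding a request. The `decide` name, the `owners` lookup (document id to owning user), and the choice to handle only plain `/db/doc_id` paths are all assumptions for illustration. It deliberately rejects `_changes`, `_all_docs`, `POST /db`, and the other endpoints replication uses, so it covers simple document access only.

```python
from typing import Dict, Optional
from urllib.parse import unquote, urlsplit

READ_METHODS = {"GET", "HEAD"}
WRITE_METHODS = {"PUT", "DELETE"}


def decide(method: str, path: str, user: str, owners: Dict[str, str]) -> Optional[int]:
    """Return None to forward the request to CouchDB, or an HTTP status to reject it."""
    parts = [unquote(p) for p in urlsplit(path).path.split("/") if p]
    # Only plain /db/doc_id requests are handled; special endpoints such as
    # _all_docs or _changes need their own, more careful rules.
    if len(parts) != 2 or parts[1].startswith("_"):
        return 403
    doc_id = parts[1]
    owner = owners.get(doc_id)
    verb = method.upper()
    if verb in READ_METHODS:
        # 404 for both "does not exist" and "not yours", so a read does not
        # tell the caller which of the two it was.
        return None if owner == user else 404
    if verb in WRITE_METHODS:
        if owner is None:
            # Unknown id: allow creation with PUT, nothing to delete otherwise.
            return None if verb == "PUT" else 404
        return None if owner == user else 403
    return 405
```

The verb mapping itself is short. The part this sketch skips, keeping `owners` correct and making the replication endpoints respect it, is where the real work would be.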

Having said all this, you clearly have thought through this stuff deeply, so I'm very interested in hearing your thoughts here because I'm sure my comments are wrong past the surface.

xrd · on June 7, 2020

And, this might be an interesting open source project to collaborate on. I'd be very game for that. xrdawson at the google owned mail system. :)