How Streak Tracks Down App Engine Datastore Bugs (streak.com)
14 points by alooPotato on Sept 23, 2013 | 8 comments


"A typical scalable database engine like App Engine's Datastore will only look at each entity in isolation. Since the entities are technically fine in isolation, there's no guarantee of consistency from the engine level."

This is not true. A feature in App Engine called entity groups was added to work around exactly this type of problem. The Dog entities should have their entity group (parent) set to the DogWalker. This will ensure full read/write consistency among that group of entities.

Combine this with transactions and cross-entity-group transactions, and you can ensure that you don't have dangling or incorrect pointers like the ones you describe.
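To make "dangling or incorrect pointers" concrete, here is a minimal sketch in plain Python. It does not use the Datastore API at all: the `walkers` and `dogs` dicts and the `find_inconsistencies` helper are hypothetical stand-ins, assuming each DogWalker keeps a list of Dog ids and each Dog keeps a back-pointer to its walker.

```python
# Hypothetical in-memory stand-ins for Datastore entities (not the real API).
walkers = {
    "walker1": {"dogs": ["rex", "fido"]},
    "walker2": {"dogs": ["ghost"]},   # "ghost" has no Dog entity at all
}
dogs = {
    "rex": {"walker": "walker1"},
    "fido": {"walker": "walker2"},    # back-pointer disagrees with walker1's list
}


def find_inconsistencies(walkers, dogs):
    """Return a sorted list of (walker_id, dog_id, problem) tuples."""
    problems = []
    for walker_id, walker in walkers.items():
        for dog_id in walker["dogs"]:
            dog = dogs.get(dog_id)
            if dog is None:
                # The walker points at a Dog that does not exist.
                problems.append((walker_id, dog_id, "dangling"))
            elif dog["walker"] != walker_id:
                # The Dog exists but points back at a different walker.
                problems.append((walker_id, dog_id, "mismatched back-pointer"))
    return sorted(problems)


issues = find_inconsistencies(walkers, dogs)
for issue in issues:
    print(issue)
```

Each entity here looks fine in isolation; the problems only show up when the two sides of the relationship are compared.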


While it's true that App Engine has entity groups and transactions on those entity groups, you are still responsible for ensuring consistency in your application code.

You have to remember to set the pointers between the two entities correctly and to do the write in a transaction. Granted, this should be easy, but mistakes can be made.

Compare this to a traditional relational DB, where this consistency is enforced at the database engine level (if you've set up your schema that way) and it won't let you write inconsistent data.
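As a rough illustration of engine-level enforcement, here is a small sketch using Python's built-in `sqlite3` module. The table and column names (`dog_walker`, `dog`, `walker_id`) are made up for this example, and SQLite is only standing in for "a relational DB"; note that SQLite needs foreign keys switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dog_walker (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE dog ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " walker_id INTEGER NOT NULL REFERENCES dog_walker(id))"
)

conn.execute("INSERT INTO dog_walker (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO dog (name, walker_id) VALUES ('Rex', 1)")  # valid pointer


def try_insert_dangling_dog(connection):
    """Try to insert a dog whose walker does not exist; report whether it worked."""
    try:
        connection.execute("INSERT INTO dog (name, walker_id) VALUES ('Ghost', 99)")
    except sqlite3.IntegrityError:
        return False  # the engine refused the inconsistent row
    return True


rejected = not try_insert_dangling_dog(conn)
dog_count = conn.execute("SELECT COUNT(*) FROM dog").fetchone()[0]
print(rejected, dog_count)
```

The dangling row never makes it into the table, but only because the schema declares the reference in the first place.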


Whether consistency is ensured at the application level or the database level, mistakes can be made. Someone can forget to set things up in a schema correctly in exactly the same way they can forget to implement things correctly at the application level.

Also, even with relational databases, there are many times when you need to use transactions to prevent certain types of inconsistencies; schemas don't protect against everything.


Parent-child relationships, though incredibly tempting and useful, have a very serious limitation in app engine datastore.

Any transaction on an entity in a group will cause any other concurrent writes to the same entity group to fail. Therefore, if you have a large entity group with lots of writes, this causes lots of contention, and your app then has to handle the expected write failures, which is a pain!


My reply was mainly aimed at the example posted here: really, how often will a Dog Walker's set of Dogs change? Keep in mind that transactions can be retried too, so it's not enough for 2 requests to collide; an entity group needs to see more than about 1 write per second for that to be an issue.
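For what retrying on contention can look like, here is a minimal sketch in plain Python. `ContentionError`, `run_with_retries`, and `flaky_update` are hypothetical names for illustration, not App Engine APIs; the real SDK has its own retry options for transactions.

```python
import random
import time


class ContentionError(Exception):
    """Hypothetical stand-in for the error raised on a write conflict."""


def run_with_retries(txn, max_attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call txn(), retrying with jittered exponential backoff on contention."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except ContentionError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle the failure
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            sleep(delay)


# Simulated transaction that collides twice and then commits.
calls = {"count": 0}


def flaky_update():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ContentionError("entity group busy")
    return "committed"


# Pass a no-op sleep so the demo does not actually wait.
result = run_with_retries(flaky_update, sleep=lambda seconds: None)
print(result, calls["count"])
```

The function passed in has to be safe to run more than once, since a retried transaction re-executes its whole body.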

But beyond that, there are other app engine tools like task queues, which can make dealing with contention pretty easy.


You are right.

An unrelated query: does anyone know of companies (like Streak.com) using app engine datastore in production? We are newbie programmers, and developing our app on app engine (because it was easy to learn) - but it feels a bit lonely sometimes :)


Hey there, I work on the App Engine team. To connect with other App Engine developers, you can check out the Google Group or Stack Overflow: https://groups.google.com/forum/#!forum/google-appengine and http://stackoverflow.com/questions/tagged/google-app-engine

And our website has a list of some of our customers at cloud.google.com/customers.


There are lots of companies (Streak.com is one of them) using appengine in production.



