My particular favourite is how GraphQL servers respond with "200 OK" and send the errors in a key called "errors". Makes regular healthchecks, which only look at the status code, almost useless.
I ended up writing my own service[0] to detect problems with graphql responses, before expanding it to cover websites and web apps too.
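For anyone wondering what such a check looks like, here's a minimal sketch in Python. It assumes the response body is JSON with the usual top-level "errors" key; the function name `graphql_healthcheck` is made up for illustration and has nothing to do with how the service above works.

```python
import json


def graphql_healthcheck(status_code: int, body: str) -> bool:
    """Return True only if the HTTP status is 2xx AND the body carries no GraphQL errors.

    Hypothetical helper: a plain status-code check would stop after the first test.
    """
    if not 200 <= status_code < 300:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A 200 with a non-JSON body is not a healthy GraphQL response.
        return False
    if not isinstance(payload, dict):
        return False
    # A missing, null, or empty "errors" key counts as healthy.
    return not payload.get("errors")
```

You'd feed it the status code and raw body from whatever HTTP client you already use, so a 200 with a populated "errors" list gets reported as down.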
I honestly hate that so much, it's a relief to read someone saying the same.
I sort of almost made myself feel a bit better about it by thinking 'no, it's not REST, we have reached the graphql server successfully and got a .. "successful" response from it, it's sort of a "Layer 8" on top of HTTP'. The problem is that none of the bloody tooling is 'Layer 8', so you end up in browser dev tools with all these 200 responses and no idea which ones are errorful. If any.
I mean, I agree. Given the nature of the protocol, it makes sense that a half-successful response of independent queries would still return a 200 on the network protocol.
I actually think I agree with your former self. How do you tell the difference between a server error and an application error? How do you tell the difference between "record not found" and "there is no GraphQL endpoint here at all"? Or "you are not allowed to access GraphQL" and "you are not allowed to access the server"?
Especially because error responses from your web server layer are usually really different than errors from your backends.
While using HTTP status codes could work for GraphQL payloads which have only one operation in them, this approach would not work for those which have multiple[0].
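To make the half-successful case concrete, here's a rough Python sketch that classifies a response body instead of relying on the status code. It assumes errors carry the optional "path" list described in the GraphQL spec; the `classify` name and the "ok" / "partial" / "failed" labels are my own invention.

```python
from typing import Any


def classify(payload: dict[str, Any]) -> tuple[str, list[str]]:
    """Return a (status, failed top-level fields) pair for a decoded GraphQL response."""
    errors = payload.get("errors") or []
    if not errors:
        return ("ok", [])
    # The first element of an error's "path" names the top-level field that failed.
    failed = sorted({str(e["path"][0]) for e in errors if e.get("path")})
    data = payload.get("data")
    # "partial" means at least one top-level field still resolved to a value.
    has_data = isinstance(data, dict) and any(v is not None for v in data.values())
    return ("partial" if has_data else "failed", failed)
```

A single HTTP status can't carry the "partial" outcome, which is the point being made above: the per-field result only exists in the body.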
> GraphQL looks like it's been implemented by someone who thought 200 and 404 were the only possible codes.
Maybe. Or maybe they decided that a 2xx status would be interpreted as "success" by a non-trivial set of libraries and/or systems. Either way, take it up with the standards committee :-).
AWS S3 does the opposite when querying objects that don't exist. If you don't have the s3:ListBucket permission on the bucket, you'll get a 403 error instead of a 404 (so you can't differentiate between the object not existing and you not having access to it).
I think either approach is valid as long as you're consistent. You can make a case for either 404 or 403 when you don't have enough permissions. In GitHub's case you can argue that it's a 404 because the resource does indeed not exist through your auth context. In AWS' case you can argue that a 403 makes sense because you don't have permission to know the answer to your query.
[0]: https://onlineornot.com