
You are right, it's not so useful for the Result type, but it still comes in handy for Option types (since None doesn't hold anything).


Agree, I don't know if it's that useful for `Result<T>`, but for `Option<T>`, there have been a couple of times I've written

  if foo.is_none() {
    return;
  }
  let foo = foo.unwrap();
Now I can simply do

  let Some(foo_unwrapped) = foo else {
    return;
  };
which is prettier than using `if let (...)` just to unwrap it, IMO.


Without let-else, you could write that as:

    let foo_unwrapped = match foo {
      Some(foo_unwrapped) => foo_unwrapped,
      None => return,
    };
Not as pretty, but you don't have to unwrap.


For Result it's probably more common to use `?`; alternatively, you could use

    let x_result = something;
    let Ok(x) = x_result else {
        ...
    };
But I expect that it will be used mostly with Option, as in "else continue" or "else break".
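
For illustration, here's a rough sketch of both styles (all names are made up):

    // With Result, ? propagates the error to the caller:
    fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
        let port: u16 = s.parse()?;
        Ok(port)
    }

    // With Option, let-else pairs naturally with continue/break in loops:
    fn first_even(items: &[Option<u32>]) -> Option<u32> {
        for item in items {
            let Some(n) = item else { continue };
            if n % 2 == 0 {
                return Some(*n);
            }
        }
        None
    }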


I skimmed through their docs but couldn't find whether they:

* support abstract apps, not only HTTP/web apps like Heroku does; let's say I want to deploy a SIP app

* support HTTP/2 and potentially HTTP/3

If they do support these two, I would say it's enough to be considered a Heroku killer.


You can route arbitrary TCP or UDP services through our load balancing layer just fine. I'm not sure what SIP actually needs, but it might work. We don't currently have a way to route TCP connections directly to individual VMs, so stuff like WebRTC doesn't work.

We do support HTTP/2.


It's so funny how every time services are down, their status page says everything is normal. I understand that there are internal politics around such things, but still...


I wish I could use one of those. I actually had one some years ago, but my eyes were burning (I use huge fonts compared to the average person on my regular 17-inch laptop).


Usually you need Redis to scale horizontally, meaning across multiple servers/processes, while still keeping your data consistent. In general Redis is a very useful tool that provides various features, even pub/sub.
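
To make that concrete, a minimal sketch assuming the Rust redis crate (the key name and URL are made up):

    use redis::Commands;

    // Two horizontally scaled processes can bump the same counter through
    // Redis; INCR is atomic on the server, so they can't race each other.
    fn bump_visits() -> redis::RedisResult<i64> {
        let client = redis::Client::open("redis://127.0.0.1/")?;
        let mut con = client.get_connection()?;
        con.incr("visits", 1)
    }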


> Returning a 2xx code immediately tells the client that the HTTP response contains a payload that they can parse to determine the outcome of the business/domain request.

There is nothing in the protocol that mandates only 2xx status codes are parsable. Instead, the Content-Type header tells the client what kind of content the body has, regardless of the status code. And the Content-Type will most probably be what the client asked for (if not, the server is announcing that the response content is not subject to negotiation, something you rarely see nowadays...)

In general I think this post reflects one of the mentalities that appeared around 10 years ago regarding HTTP APIs (another was the extreme hypermedia-driven APIs). But I really think we can do much better nowadays; we can deliver a practical API that works 99% of the time.

By the way, in the latest HTTP RFC (RFC 9110) status code 422 is finally officially defined (previously it was part of the WebDAV extension), which means that the war of 422 vs 400 for validation and other semantic errors is over and we have a clear winner. But 200 vs 404 for something that was not found? Well, that war ended long ago, I think...


I think 404 being a common and useful server error is the issue. Had they/REST aligned on 204 No Content (for things like the employees/100 example) or something 2xx and less common, I think it wouldn't be much of an issue at all. I still think it's actually not much of an issue. Of all the quirks out there, this creates little pain.


> Had they/REST aligned on 204 No Content (for things like the employees/100 example)

Who is “they/REST”, and where have “they/REST” aligned on anything? All REST says is “use the underlying protocol without messing with its semantics, but only the parts that correspond to REST semantics”.

If the resource exists and is accurately represented by no data, then 204 fits for a GET, but that's a different case than when there is no resource matching the URL. 204 is mostly for modifying actions (e.g., PUT, DELETE) that don't need to return more than the fact of success.


I've never seen a response indicating the content is not subject to negotiation. I generally only see that the response is not acceptable (i.e. the client has indicated they can't accept it) and the server skips processing the request.


> There is nothing in the protocol that mandates only 2xx status codes are parsable

I think a kinder reading of this point is "a 2xx response means you can parse the thing you were expecting to parse"


That should be indicated by the Content-Type header, not the status code. If you get a 2XX response but the Content-Type isn't what you expect, you probably shouldn't try to parse it. I've seen misbehaving APIs return 2XX when they really should've returned 503.

Often this is to get around the inability of some cloud-based load balancers to accept 5XX status codes as healthy responses. Take AWS ELB/ALB, for example. There are conditions under which a target instance knows the underlying issue isn't related to its own health, such as during planned maintenance, upgrades, or database failures. In these situations, it would be desirable to return 503 and configure ELB/ALB to accept that as a healthy response. Since AWS won't let you do that, some applications will just return an empty or bogus 200 response during upgrades or maintenance.
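
On the first point, gating parsing on the Content-Type header might look roughly like this (a sketch assuming the Rust reqwest (blocking) and serde_json crates; the URL is made up):

    // Gate parsing on Content-Type, not on the status code alone.
    fn fetch() -> Result<(), Box<dyn std::error::Error>> {
        let resp = reqwest::blocking::get("https://api.example.com/employees/100")?;
        let status = resp.status();
        let is_json = resp
            .headers()
            .get(reqwest::header::CONTENT_TYPE)
            .and_then(|v| v.to_str().ok())
            .map_or(false, |ct| ct.starts_with("application/json"));
        let body = resp.text()?;
        if is_json {
            // Parse the body whether the status was 2xx or 4xx.
            let parsed: serde_json::Value = serde_json::from_str(&body)?;
            println!("{status}: {parsed}");
        } else {
            // Don't try to parse; keep the raw text for logging.
            eprintln!("{status}: unexpected content type: {body}");
        }
        Ok(())
    }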


Ah, you got me on that one, fair point.

But then it would be both, wouldn't it?

I know I have a JSON payload representing a domain response because of the 2xx response code AND the Content-Type header?


You have to validate that the response is well-formed even if you parse it. There's no harm in trying to parse it if the Content-Type indicates it's JSON (or XML, or whatever else you're expecting). You can then use that result--or lack thereof, in case you couldn't parse it--to determine why you got a particular status code.

If a resource isn't found for any reason, 404, 410, or 451 is the correct response. If you want to clarify why it's not found, that should be included in the response body. Don't return 200 while simultaneously reporting an error--that's just bad form. 2XX means everything is good, 4XX means problem on my end, 5XX means problem on the API's end. It's an easy way to tell at a glance who's likely at fault. Yes, status codes are always going to be ambiguous, but that's why there are response bodies alongside them. If the Content-Type header is something you recognize, you can at least attempt to automate that disambiguation process.
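
That at-a-glance triage is simple enough to sketch (a minimal Rust sketch with illustrative names):

    // Who's likely at fault, judging by the status class alone.
    fn blame(status: u16) -> &'static str {
        match status / 100 {
            2 => "all good",
            4 => "problem on my end: fix the request",
            5 => "problem on the API's end: retry or report it",
            _ => "informational/redirect: handle separately",
        }
    }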


> If a resource isn’t found for any reason, 404, 410, or 451 is the correct response.

Nitpick, but 421 should be on this list, although the circumstances where you would need this should be extremely rare.


> I think a kinder reading of this point is “a 2xx response means you can parse the thing you were expecting to parse”

It doesn’t though.

Even with an Accept header on the request, it is permissible for a server to return a 2xx result in a different format, indicated by the Content-Type header, if it cannot, for whatever reason, return a result in a requested format (it may also, consistent with the HTTP spec, return a 406 Not Acceptable or, if it wants to be cagey about whether a resource representation exists when none matches the Accept header, a 404 instead).

If you want to know whether and how you can safely parse the response, the HTTP compliant way is to read the Content-Type header. Otherwise, you are relying on assumptions (which may be valid for out-of-band, non-HTTP reasons) about behavior that are outside of the spec.


This is a whole lot of nice-sounding theory, but in practice it's just a downright mess to handle in a real application calling the API. If you are using the Angular httpClient, for example, a 404 immediately throws an observable error when your app really should be telling the user that there are no results for that query. This crosswires a potential server-level error (broken routing, etc.) with a request-level error in error handling, makes it way harder to determine the cause of the error, and leads a dev to just write `status.code === "404" ? 'User does not exist' : ....`

Did I mention that httpClient, by default, doesn't let you get 404 error bodies?

But ultimately, it is all just ideas on 'how neat would it be to use codes!' when in practice it is so much better to just drop that and use the codes for more literal values. Imagining users/{X} as a 404 for an invalid 'X' is fun... but the server actually defines it as something like /users/:userId, so it isn't actually an invalid route and will not be caught by the default 404 handling. It's a conceptual dance.


> Did I mention that httpClient, by default, doesn't let you get 404 error bodies?

Why would an HTTP client not let you get HTTP response bodies for statuses that usually send bodies? I could understand it for a 201, and definitely for a 204, but for a 404 it just seems like bad design of the client.


What I am hearing is “Angular httpClient is badly designed”, which is the kind of risk you run into a lot with big, monolithic, highly opinionated frameworks.


jQuery gives you the body. ExtJS gives you the body. Webix gives you the body. Maybe this is an Angular problem?


Don't get me started on HTTP 300 support


> By the way, in the latest HTTP RFC (RFC 9110) status code 422 is finally officially defined (previously it was part of the WebDAV extension), which means that the war of 422 vs 400 for validation and other semantic errors is over and we have a clear winner.

Unfortunately, the way 422 is written implies that the error is in the sent body/content, not the headers. It's close, but I still feel that for GET requests it's wrong.


The standard says nothing about bodies:

> The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415 (Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions.

https://datatracker.ietf.org/doc/html/rfc4918#section-11.2

Additionally, bodies are allowed on GET requests by the standard, though they are not commonly used because of bad middleboxes. However, many GET requests include query params or other parts of the URL to be parsed, and it's a completely reasonable interpretation of the standard to return 422 if those are not well-formed according to application rules.
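
Under that interpretation, the 400-vs-422 decision for a query string might look roughly like this (all names hypothetical):

    // Syntactically broken query -> 400; well-formed but semantically
    // unprocessable value -> 422; otherwise proceed (200).
    fn status_for(query: &str) -> u16 {
        match query.split_once('=') {
            Some(("from", value)) if is_plausible_date(value) => 200,
            Some(("from", _)) => 422, // key=value parses, but the value is nonsense
            _ => 400,                 // not even a key=value pair we recognize
        }
    }

    // Minimal YYYY-MM-DD shape check, just for the sketch.
    fn is_plausible_date(s: &str) -> bool {
        let parts: Vec<&str> = s.split('-').collect();
        parts.len() == 3
            && parts.iter().all(|p| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit()))
    }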


I can't believe that you haven't read the f***ing standard:

"HTTP messages often transfer a complete or partial representation as the message "content": a stream of octets sent after the header section, as delineated by the message framing." - RFC 9110 (https://www.rfc-editor.org/rfc/rfc9110.html#name-content)

Earlier standards even used the "body" and "content" in the same context:

"The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field. Request message framing is independent of method semantics, even if the method does not define any use for a message body. ... When a message does not have a Transfer-Encoding header field, a Content-Length header field can provide the anticipated size, as a decimal number of octets, for a potential payload body. For messages that do include a payload body, the Content-Length field-value provides the framing information necessary for determining where the body (and message) ends." - RFC 7230 (https://www.rfc-editor.org/rfc/rfc7230#section-3.3)

This is probably due to historical reasons - MIME (the email standard) uses "content", while the original draft uses "body".


On the earlier standard I wouldn't agree: the fact that a "Content-Length" field is used doesn't necessarily give a glossary definition for "content". But I'll be damned, you're absolutely correct: RFC 9110 leaves no room for doubt. This is huge and leaves a terrible taste in the mouth; the word "content" is far too common to imprint a definition on like this. Were the people who wrote every single occurrence of "content" in the standard aware of this? Should we take guesses and interpret HTTP 422, for example, in the spirit of the law? This makes ambiguity and error far too likely in my opinion; it's appalling to see it in such an important standard without any further explanation at all.


RFC 4918 also doesn't use the word "content"; it in fact says "contained instructions" and "request entity", which are not defined terms in either RFC you linked.

> and the syntax of *the request entity* is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process *the contained instructions*.

Emphasis mine


There's nothing in the protocol, but in general, you should not assume non-2xx status codes have parseable payloads (more specifically, you should not assume the format of a non-200 status code).

The reason is that any intermediary step in the chain of resolving your request can return the response. Your load balancer has fallen over? You'll get a 500 with a plaintext body. Don't parse that as JSON.

(Technically, any intermediary step might also return you a 2xx with a non-parseable body, but that scenario is far, far rarer... I mostly see it with completely misconfigured servers where, instead of the service you're trying to talk to, you get the generic "Set up your site" landing page.)


> you should not assume the format of a non-200 status code

You should never assume format based on status code at all! You should detect it based on the Content-Type header.

> You'll get a 500 with a plaintext body. Don't parse that as JSON.

Any intermediary which returns plain text with an application/json Content-Type is badly, badly broken.


> You should detect it based on the Content-Type header

In approximately a decade of working on JavaScript and TypeScript UI code, I can count on one hand the number of RPC handlers I've seen that inspect Content-Type in the success codepath.

... for that matter, I can count on one hand the number of RPC handlers I've seen that inspect Content-Type in the failure codepath as well. The common behavior is to dump the body to text, log the error, and recover.


> There’s nothing in the protocol, but in general, you should not assume non-2xx status codes have parseable payloads (more specifically, you should not assume the format of a non-200 status code).

There’s absolutely something in the protocol, and you should absolutely use the Content-Type header to determine whether there is a body and whether it is in a parseable format irrespective of the status code, except for the minority of status codes defined as not having a body (e.g., 204 & 205).


Unfortunately real-world use is `Content-Type: application/json` for everything.


> There is nothing in the protocol that mandates only 2xx status codes are parsable.

I think my overly defensive view of this for real-life code is that error states are inherently situations where the normal code contracts break down, and that I must make fewer assumptions about the response, for example that it is well-formed or even matches the requested content type.

The number of times that I've encountered a JSON API that can suddenly return HTML during error states or outages is too damn high. So unless you give me a 2xx I'm not going to assume I got something intelligible.


> I must make fewer assumptions about the response, for example that it is well-formed or even matches the requested content type.

I think that you should always assume that an HTTP response is a well-formed HTTP response (otherwise you can't even trust that the 404 itself is correct); and you should never assume that the received MIME type is the same as one of those you indicated you accepted; you should always check the Content-Type header.


I'd say most web APIs I've used or developed recently return JSON even with 4xx or 5xx error codes. What can be annoying is knowing which JSON schema to parse with depending on the status code, as not even the Content-Type header will tell you that. APIs (especially those behind load balancers) that sometimes return HTML and sometimes JSON are far too common though; the problem there is that the JSON responses are appropriate for programmatic consumption (terse, semantically well defined) where the HTML ones typically aren't. But even if the Content-Type header is abused (application/octet-stream, anyone?), it's not hard to write code that copes with either. One API I used recently returned JSON in some cases and CSV in others, with no distinction in status code OR content type!
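
One way to cope with the schema-per-status problem, sketched with Rust's serde (field names made up): an untagged enum tries each shape in turn, so you don't need to know up front which one you got.

    use serde::Deserialize;

    #[derive(Deserialize, Debug)]
    #[serde(untagged)]
    enum ApiBody {
        // Tried first: a body with an "error" field deserializes as Err.
        Err { error: String },
        // Otherwise fall through to the success shape.
        Ok { id: u64, name: String },
    }

    fn parse_body(raw: &str) -> serde_json::Result<ApiBody> {
        serde_json::from_str(raw)
    }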


Serialization/deserialization is such an important part of web development that I have no idea why Rails includes the ancient JBuilder library (which is very slow, since it goes through templating) instead of investing in a proper one. Let alone deserialization, which is equally important...

I think the API Shale provides is pretty sane. I would probably use it in my next Ruby/Rails project. I don't like the fact that Nokogiri is included by default; it would be nice to declare a core type and then bring in what you need (JSON, XML, YAML) as a different gem. But that's not a deal breaker for me.

I have created my own serializers in the past (SimpleAMS [1]) because I really detested AMS. No offence to the AMS contributors, but the AMS library should just die. Rails, and way more importantly Ruby, should come up with an "official" serializers/deserializers library that is flexible enough, rock solid, and fast. For instance, I had done some benchmarking among common serializer libraries [2], and AMS was crazy slow without providing much flexibility (meaning the slowness is not justified). Others were faster, but supported only one JSON spec format (like jsonapi-rb). I am wondering where Shale stands.

Another thing is that most serialization libraries seem to have ActiveSupport as a main dependency (not Shale, though), which I think is a bit too much and actually incurs a performance hit on the methods it provides.

I really think the Ruby community can do better here.

[1] https://github.com/vasilakisfil/SimpleAMS

[2] https://vasilakisfil.social/blog/2020/01/20/modern-ruby-seri... (scroll towards the end for benchmarks)


I'm glad you like it. One clarification: Nokogiri is not required by default; you have to explicitly require "shale/adapter/nokogiri" to use it. If you don't, Shale will use REXML, which comes from Ruby's standard library.


REXML has been gemified. Shale's gemspec doesn't require a specific version of rexml, and rexml < 3.2.5 is vulnerable to CVE-2021-28965. I just checked Ubuntu 20.04 LTS and got Ruby 2.7 with rexml 3.2.3 by default, so this seems like a realistic concern, and it would be safer if Shale required a minimum rexml version.

See http://www.ruby-lang.org/en/news/2021/04/05/xml-round-trip-v...


I have mixed feelings about this: the standard library's vulnerabilities are part of Ruby's vulnerabilities, so you would update your Ruby version anyway. But you're right, specifying the version explicitly would prevent this.


I think one of the motivations for splitting the stdlib into gems was exactly this kind of scenario: some users might not be able to update their Ruby immediately. The ruby-lang advisory explicitly recommends bumping the REXML version.


I have definitely been in situations where I couldn't update the Ruby version in a timely manner, but have been able to bump a gem version (like in this example).


If I get a Dependabot alert for my Rails project, I would do well to bet that it's a Nokogiri vulnerability. I haven't looked into the "why" or what's really going on, but it does feel like there's a lot of room to examine the attack surface or any core design issues.


Nokogiri is one of the most security-sensitive parts of any Rails codebase, since it's used for parsing and sanitizing untrusted HTML and XML documents. Accordingly, there's a lot of scrutiny on it (and its upstream dependency, libxml2). That said, as far as I'm aware, almost all of the recent vulnerabilities I've noticed have been related to XSLT and other obscure XML features that most people probably don't use (and that aren't enabled by default). So there's a combination of two things: 1) lots of scrutiny on the library itself leads to high security standards, and 2) the goal of fully-featured XML processing adds a large attack surface, not relevant to most people, that leads to a lot of vulnerability alerts.

Personally, though, I've been seeing almost 10x the number of alerts for useless "vulnerabilities" like ReDoS in nodejs projects. Either way, alert fatigue is real.


The last one was about libxslt... and I'd be shocked if anyone is using XSLT in a production environment that is also actively maintained.


XML is chock-full of misfeatures ripe for creating security vulnerabilities. It's not just nokogiri – XML parsing libs are one of the hottest sources of vulnerability notifications in many ecosystems (a large number of those CVE alerts come by way of using libxml2 under the hood, which nokogiri also depends on).

Safely parsing untrusted XML is an extremely hairy task.


Nice! I use that quite often, so it might come in handy. Quick question: how did you build it? Does Google provide any sort of API, or did you build a web scraper for it?


If you look at the source you can see they literally launch a Chrome instance in a Lambda and Google your query with site:reddit.com prefixed.


While I don't know how that one was built, there are a few options for doing the same:

- scraping, probably against the TOU

- API -> https://programmablesearchengine.google.com/about/

- same, except embedded -> https://programmablesearchengine.google.com/about/

Context: we use all of those, plus Bing, etc., for https://breezethat.com


Nom is a fantastic library. I have built a SIP library [1] on top of Nom; there's no way I would have built that without Nom's help, and even if I had, it would have been a heck of a mess of under-optimized code.

[1]: https://github.com/vasilakisfil/rsip
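
To give a flavor of what Nom makes easy, here's a tiny hypothetical sketch (not taken from the library) that recognizes a SIP method at the start of a request line, using nom 7 combinators:

    use nom::{branch::alt, bytes::complete::tag, IResult};

    // Try each method tag in turn; on a match, return the remaining input
    // along with the matched method.
    fn sip_method(input: &str) -> IResult<&str, &str> {
        alt((tag("INVITE"), tag("ACK"), tag("BYE"), tag("REGISTER")))(input)
    }

    // sip_method("INVITE sip:bob@example.com SIP/2.0")
    //   == Ok((" sip:bob@example.com SIP/2.0", "INVITE"))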


Location: Stockholm, Sweden & Athens, Greece

Remote: Yes

Willing to relocate: Maybe

Technologies: Rust, Ruby, JavaScript, PostgreSQL, ElasticSearch, Redis, AWS, networks, APIs, VoIP, SIP, SDP, WebRTC

Résumé/CV: https://vasilakisfil.social/public/FilipposVasilakis.pdf

Email: vasilakisfil@gmail.com

Among others:

* HTTP/REST APIs specialist, pure interest in networked services and optimizations

* Good knowledge of CAS/SAML/OAuth2 and related auth protocols

* Good knowledge of VoIP/SIP/SDP/RTP/WebRTC and related protocols

* Good knowledge of Distributed Systems theory, concepts & principles

Looking for a permanent role, but open to consultancy as well.

