Hacker News

I don't understand why this is such a complicated category, or why so many platforms don't have solid HTTP clients in their standard libraries.

On every single project I do, it's just a bunch of posting JSON and getting a response synchronously. Over and over.
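To put the "simple" case in concrete terms: even plain JSON-posting takes a little ceremony with only the standard library. A sketch, where the URL and payload are placeholders:

```python
import json
import urllib.request

def post_json(url: str, payload: dict) -> urllib.request.Request:
    """Build a synchronous JSON POST using nothing but the stdlib."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = post_json("https://api.example.com/things", {"name": "widget"})
# Actually sending it is one more line: urllib.request.urlopen(req)
```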



I blame the bloat.

In some high-level languages even BSD sockets (and many other POSIX functions) aren't in the standard library, and there are various wrappers to provide "ease of use" and integration with a language's runtime system; plenty of complexity (and alternatives) even at that point.

RFC 2616 (HTTP/1.1, 1999) may seem manageable, but it's much more than just posting data and getting a response, and IME many programmers working with HTTP aren't familiar with all of its functionality. Then add TLS with SNI, cookies, CORS, WebSocket protocol, various authentication methods, try to wrap it into a nice API in a given language and not introduce too many bugs, and it's rather far from trivial. But that's just HTTP/1.1 with common[ly expected] extensions.

Edit: Though I think it'd also be controversial to add support for particular higher-level protocols into standard libraries of general-purpose languages, even if it was easy to implement and to come up with a nice API.


> RFC 2616 (HTTP/1.1, 1999) may seem manageable

But it's probably not, as it's underspecified and ambiguous, which is part of why it's been replaced as the HTTP/1.1 spec by RFCs 7230-7237 (2014).


There are plenty of reasons why it's hard to ship this in a standard library. Here are some off the top of my head:

- Should the library include its own CA store, or use the system's CA store? Libraries like this often bundle their own CA store (since root lists change often), and httpx seems to use a third-party library (certifi) to handle that. That's hard to do in a standard library for a variety of reasons (users rarely update their Python installation, the system CA store is not always available or up to date, etc.).

- While the HTTP protocol itself is pretty stable, some parts of it are still changing over time: compression types (Brotli is gaining traction these days, and we may well get new ones in the future), newly added HTTP headers, etc. Security issues also show up all the time. The user will want a tighter release schedule than Python's so they can get this stuff sooner. The situation is even worse for users stuck on a particular version of Python for some reason, since they'll never get these updates at all.
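The compression churn is easy to illustrate: gzip has been in Python's stdlib forever, but a newer codec like Brotli still needs a third-party package, so a stdlib HTTP client would lag on `Accept-Encoding` support. A stdlib-only round-trip:

```python
import gzip

# gzip is the one response codec the stdlib has always shipped;
# Brotli (and friends) require third-party packages, which is exactly
# the release-cadence problem described above.
body = gzip.compress(b'{"ok": true}')
decoded = gzip.decompress(body)
```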


> Should the library include its own CA store, or use the system's CA store?

The CA store should be a configurable option, and one of the supported options should be the system CA store.

> The user will want a tighter release schedule than Python's so they can get this stuff sooner.

Ruby is moving stdlib to default and bundled gems, which addresses this. There's no reason that “delivered with the interpreter” needs to mean “frozen with the interpreter”.


> The CA store should be a configurable option, and one of the supported options should be the system CA store.

It's more complicated than that, especially if you aren't on Linux.

On the other two big general-purpose platforms (macOS, Windows), the vendor provides a library that implements their preferred trust rules as well as using their trust store.

On Linux, what you usually get is the list of Mozilla-trusted root CAs, and you're left to your own devices. Mozilla's trusted list is IMNSHO shorter and more trustworthy than those supplied by Apple or Microsoft, but it misses nuance.

When Mozilla makes a nuanced trust decision for the Firefox browser, that decision doesn't magically reflect in an OpenSSL setup on a Linux server. Say they decide that Safety Corp. CA #4 can be relied upon to put the right dates on things, but its DNS name checks are inadequate and no longer to be trusted after June 2019. Firefox can implement that rule, distrusting sites with an August 2019 cert from Safety Corp. CA #4 while still trusting, say, the Safety Corp. CA #4 certificate on Italian eBay from March 2018. But there's no way for your Python code to achieve the same outcome relying on OpenSSL.

Python's key people seem to think that it's better for Python to try to mimic what a "native" web browser would do because that's least surprising. So on Windows a future Python will trust Microsoft's decisions, on macOS they'd be Apple's decisions and only on Linux will it be Mozilla decisions. Today it's Mozilla's trust store everywhere.
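That platform split is already visible in the stdlib itself: `ssl.enum_certificates` exists only on Windows, and everywhere else `create_default_context()` just hands you whatever bundle OpenSSL was configured with. An illustrative sketch:

```python
import ssl
import sys

def windows_root_count() -> int:
    """Count the DER certs the OS "ROOT" store exposes (Windows only)."""
    if sys.platform != "win32":
        return 0  # ssl.enum_certificates doesn't exist off Windows
    return sum(
        1
        for _cert, encoding, _trust in ssl.enum_certificates("ROOT")
        if encoding == "x509_asn"  # DER-encoded certificate
    )

# On Windows, load_default_certs() enumerates the OS store for you;
# on macOS/Linux this is just OpenSSL's configured bundle, with none
# of Mozilla's per-CA nuance (e.g. date-scoped distrust) attached.
ctx = ssl.create_default_context()
```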

Hypothetically, in the ideal case, you'd have your own PKI and all the necessary diligence in place: hire your own auditors, maybe even have contractors red-teaming the CA you trust. But we don't live in a world anything like that; most people are implicitly reliant on the Web PKI, and tools like these probably need to accept that.


HTTP/2 was only standardized 4 years ago. HTTP/3 is being actively developed. That's not really stable on the timescale of a language's standard library, IMHO.


It's simple: Python was created before HTTP even existed. Since then a lot has changed, and once you create an API, it's hard to rewrite it when new usage patterns emerge.

It's easier for a third-party package to come up with a better API, because it can start fresh. When there's a radical change, it's also easier for a new third-party package to take over. httpx is an example of such evolution, though driven not by changes in HTTP but by changes in Python: it makes use of new functionality that's harder to retrofit into requests, mainly async support and type annotations.
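The async pattern an httpx-style client enables is essentially asyncio fan-out. A stdlib-only sketch, with sleeps standing in for awaited HTTP calls (in real httpx code these would be `AsyncClient` requests):

```python
import asyncio

async def fake_fetch(name: str) -> str:
    # Stand-in for `await client.get(url)` with an async HTTP client
    await asyncio.sleep(0.01)
    return name

async def main() -> list[str]:
    # Three "requests" in flight concurrently -- the thing a sync-only
    # client like requests can't express without threads
    return await asyncio.gather(*(fake_fetch(n) for n in ("a", "b", "c")))

results = asyncio.run(main())
```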


Python was created before JSON existed, and yet: https://docs.python.org/3/library/json.html

It's not a hard rule, sometimes things do end up in the standard library.
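And the stdlib json module handles the wire-format round-trip that HTTP clients lean on:

```python
import json

payload = {"id": 1, "tags": ["http", "client"], "ok": True}
wire = json.dumps(payload)  # -> str, ready to become an HTTP body
decoded = json.loads(wire)  # lossless round-trip for these basic types
```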


Of course it could be added later (there's urllib, which almost no one uses directly); I meant that Python is older than HTTP, and HTTP evolved a lot during that time.

Once you create an API, it's hard to change it. HTTP was initially very simple and evolved over time: things like REST, JSON-encoded messages, cookies, authentication (albeit rarely used), and keep-alive were added incrementally. Today's HTTP is used completely differently than it was 30 years ago. Python already had urllib, then urllib2, which was folded back into urllib in Python 3, but its API still lags behind how HTTP is used today.
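One concrete gap: session-style usage (cookies persisted across requests) is a one-liner in requests/httpx but takes a hand-built opener in urllib. A sketch:

```python
import http.cookiejar
import urllib.request

# What requests.Session() / httpx.Client() give you in one call:
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(jar)
)
# Every opener.open(...) call would now persist cookies into `jar`
```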



Indeed :) —> https://golang.org/pkg/net/http/

> Package http provides HTTP client and server implementations.



That's good, but that's arguably lower-level than httpx.



