PEP 594 – Removing dead batteries from Python's standard library

imglorp · on May 23, 2019

Nobody talks much about the original goal of "There should be one-- and preferably only one --obvious way to do it." -- PEP 20 (!)

Back when, I lurved me some Perl TMTOWTDI as much as the next guy--plenty of uses for such a thing in a Swiss army chain saw--but then Python had a good (or at least different) answer to that by reducing the language load and focusing more on the problem.

So tell me, how many string formatters do we have now? Could we please decruft and dump some of them while we're at it? % or f' or .format or whatever, but please just pick one and get the rest out of my face.

https://www.python.org/dev/peps/pep-0020/

https://en.wikipedia.org/wiki/There%27s_more_than_one_way_to...

simonh · on May 23, 2019

It's not that it's never been possible to do things other ways in Python. The obvious way to do stuff is to use the latest module or feature. The obvious way to do string formatting is now f-strings.

imglorp · on May 23, 2019

Until the newer way comes along and then there's n+1 ways.

pas · on May 23, 2019

The new ways come for a reason. And the new ways should respect PEP20, meaning they should work for the old use cases, so that everyone can start using the new ways as soon as they can upgrade to that version.

Now, of course Python is interpreted and packages are shipped as raw source code and this makes things harder for library authors, because were there a compile/transpile step, they could write using the new and shiny obvious ways, all the while providing compiled blobs (or transpiled libs) targeting multiple versions.

wund3rb4r88 · on May 23, 2019

So what reason is there for adding a new string formatting syntax that looks nothing like string formatting syntax used in any other language, and obliges a non-insignificant userbase to adapt?

Technological purity or scratching some philosophical itch would be a poor reason, IMO, given how broad the externalized costs run.

So is it provably easier to reason around the new style? Reduce computing time by an order of magnitude (in the modern era where compute cycles and memory are hella cheap)?

Code is to be written so it’s understandable and useful for humans first, right?

How does a third syntax for a solved usecase offer real value to the user base?

Or are we just being sycophants of so-called experts, peddling some novel “look”. Experts who largely built a rep on first mover advantage, but have since seen that excuse for being deified evaporate as the rest of world learned all the same tricks?

simonh · on May 23, 2019

>that looks nothing like string formatting syntax used in any other language

PHP echo "There are $apples apples and $bananas bananas.";

Ruby puts "I have #{apples} apples"

TCL puts "I have $apples apples."

Typescript console.log(`I have ${apples} apples`);

Python print(f"I have {apples} apples")

Looks pretty similar to me, and if you like understand-ability by humans so much I think f-strings win hands down. But if you disagree, there's plenty of code out there still using % strings or .format() and they're not going anywhere any time soon.

olooney · on May 23, 2019

Python's string format is also very similar to the "double mustache" convention widely used in web development. The following template string:

    "I have {{apples}} apples"

To my certain knowledge this works in Jinja2, Django templates, Vue.js, mustache.js, handlebars.js, and probably many others.

PEP 498 mentions that they did look at other languages to see what was supported:

https://www.python.org/dev/peps/pep-0498/#similar-support-in...

And they link to this Wikipedia article, which lists many examples:

https://en.wikipedia.org/wiki/String_interpolation

Scanning through that, my impression is that "there is nothing new under the sun" and aside from an occasional "$" or "#" prefix, or a willfully arbitrary deviation like ColdFusion, the conventions can all be traced the Bourne shell /bin/sh/ which was released in 1979. Prior to that, the prinf() syntax was presumably the most common, but it's obscure.

Does anyone know if the ${variable} convention was original with the Bourne shell, or if it can be traced back further?

fanf2 · on May 24, 2019

I am surprised to find that the 6th edition shell had only $1 etc positional parameter expansion, and named shell variables are not described in its man page

http://man.cat-v.org/unix-6th/1/sh

dangerbird2 · on May 23, 2019

Python string interpolation uses the same syntax as str.format(), and happens to be shared by c#’s string interpolation

poium99 · on May 23, 2019

Anecdotally, the only language in that list I encounter anymore is Python. TypeScript is a closest second.

I’m rather done with this interpreted, dynamic typed, language and the hell holes they dig us into.

DRY could be applied to more contexts than code logic: stop rewriting language features unless there is more than a subjective win of “looks nicer.” or performance is improved by an order of magnitude.

The conversations the Python community should be having are not “once again we must discuss and consider a solution to a solved problem.”

When that’s the sort of progress the language devs are prioritizing, it’s a sign to me they’re out of ideas or incapable of fixing the bigger flaws.

dang · on May 23, 2019

Could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

HN is a community. Users needn't use their real name, but should have some identity for others to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. There are legit uses for throwaways, just not routinely.

Lots more explanation: https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...

SoReadyToHelp · on May 23, 2019

> So what reason is there for adding a new string formatting syntax that looks nothing like string formatting syntax used in any other language

The new syntax is more like other modern languages, not less.

simonh · on May 23, 2019

Sure, but is Python better for having f-strings or worse?

dhuramas · on May 23, 2019

In an enterprise-y environment, it's not always possible to keep upgrading to the latest. So assuming someone creates a project based on f-strings, that would be python 3.6+ only. Whereas using string.format will work across entire python 3+. Is that worth the upgrade- I'm not convinced.

pas · on May 23, 2019

Yes, and? This still means that if someone gets the latest Python the f-strings should be used everywhere, because supposedly they solve every and all problems that the previous ones solved and some.

arshbot · on May 23, 2019

But in reality, you can't ( and shouldn't ) go back and change how the rest of the legacy codebase does things just because there's a new way to do it.

e1ven · on May 23, 2019

Right - They tried dropping legacy compatibility before, and people are still upset 10+ years later.

For something as essential as strings, it makes sense to recommend the "new" way, and continue to support the "old" way.

aeturnum · on May 23, 2019

> how many string formatters do we have now? Could we please decruft and dump some of them while we're at it? % or f' or .format or whatever

To me, there's a difference between "one way" and "one interface." I agree we should depricate %-formatting in strings, but .format / f"" are different interfaces on the same functionality - which seems fine.

aequitas · on May 23, 2019

Except that f-strings are not lazy evaluated so you cannot replace something like:

  log.debug("foo bar %s", foobar)

with fstrings. And you end up with part of your team enforcing f-strings because of cleaner look, the other half .format or '%' strings and the rest annoyed when their code is constantly rejected in reviews because they don't know/care which standard to use where.

aeturnum · on May 23, 2019

My understanding was that you can use f-strings for the thing you outlined above, the f-string composition will just be evaluated as an expression whether or not log.debug is enabled. Personally I'd trade that for the %-string drawbacks.

pfranz · on May 24, 2019

I was reading a recent bug report where they were going back and forth on this. f-strings were so performant there wasn't much difference between lazy evaluation (not evaluating them) and f-strings. The sticking points seemed to be a) if your f-string variable took unreasonably long to compute or b) if evaluating your variable caused an error.

I get all of these arguments. I see formatting errors so often in error handling code I would much prefer they'd get caught during normal execution. I can't see trusting a heavy query to lazy evaluation...or see it coming up all that often.

I see both sides, but really think using f-strings in logging is the right compromise for Python...but I just don't see them dropping c-style formatting.

marcinzm · on May 23, 2019

Languages don't solve organizational issues. If you have a dysfunctional team without leadership then that will always show somewhere. A good team would establish some set of coding standards and then enforce those. Rejecting PRs for personal preference coding standards would result in a talk from the manager.

Caligatio · on May 23, 2019

I don't think something like "foo bar %s".format(foobar) would work either in this situation. I think you're conflating format strings with the lazy evaluation associated with logging.

I am, however, in complete agreement that logging should adopt at least the same formatting used by .format

k4ch0w · on May 23, 2019

Someone linked to Amber Brown's talk and I thought this point would be important for discussion. I think there is something that isn't being considered here when it comes to relying on PyPi.

3PP(Third Party Packages) issues are responsible for a lot of application security vulnerabilities today. Every large enterprise organization has no idea what packages a developer is pulling onto their laptop and into their codebase.

As a security engineer, I like having a core team and a standard library in place that has gone through a long mature process with experienced developers, instead of someone who just git pushes code every night. You have no idea who is on the other end of that push too. It's too hard to keep track of changes and causes us to have to pin packages and versions that have been ok'd for use instead of a new release of Python. You have no guarantee the security engineer who reviewed the code didn't miss anything either.

jahewson · on May 23, 2019

There’s really two separate concepts here: review and the standard library. Why not separate concerns and have a review process that isn’t coupled to the standard library? There’s no reason why any PyPI package can’t have meaningful reviews published.

The question that you’re ultimately seeking the answer to is “what code has been reviewed and which reviewers do I trust?” - lots of ways to solve that.

zwp · on May 23, 2019

> “what code has been reviewed and which reviewers do I trust?”

Thank you for putting this so succinctly. There are other supply-chain problems in decentralizing control of the modules though:

Modules might only have a few people or one person responsible for them. Consequently suspect commits by a malfeasant or compromised (hacked) team member might not be noticed. Maybe this is a variant of the old saw "with enough eyes all bugs are shallow" but my gut feeling is that commits to Python core will get more inspection than those to a small 3rd party module. Today, modules get some review for free, just by being part of stdlib. Even if I know and trust @jdoe I have less assurances that they didn't get phished and their repo tweaked.

Also trusted team members/organizations may change over time. The browser extensions world is the poster child for this, where we've seen not only similarly-named malicious extensions posted to stores but also once-legitimate extensions quietly purchased and subsequently subverted by bad actors.

This is one example (adware: could've been much worse): https://www.bleepingcomputer.com/news/security/-particle-chr...

I like the idea of a review process but I have a hard time imagining a crowd-sourced system that wouldn't get gamed. We have a "dissolution of responsibility" problem: millions of companies rely on these components but have no explicit responsibility of care. Perhaps that needs to change, somehow.

pas · on May 23, 2019

Companies (and anyone interested) could simply pool funds together and use that to ask already trusted and respected auditors to do reviews.

And anyone who pays into the fund should be able to vote on what package to review. (And there should be a weighted lottery, so eventually small contributors' wishes have a chance to get fulfilled.)

For good measure anyone can put this on the blockchain, make a flattr/patreon thing out of this. (Somehow use github sponsoring...) Who knows.

Kalium · on May 23, 2019

> There’s no reason why any PyPI package can’t have meaningful reviews published

You are completely correct. There's no reason at all why any random PyPI package can't have meaningful reviews published. I would go so far as to say that this is true without any changes to the standard library or current processes at all.

With that in mind, I'm thinking about all the various packages my colleages use. I don't think I've seen published reviews for an appreciable fraction of them in any language. This suggests that perhaps enabling reviews might not be the hard problem to solve here.

This thought process also hilights to me that the major advantage of a stdlib is that you have a higher degree of assurance that its contents have seen meaningful review by multiple sets of eyes. It's not just the potential for review that matters, it's the degree of assurance.

anoncake · on May 23, 2019

One reason for this proposal is to save manpower. Separating concerns doesn't help with that.

afiori · on May 23, 2019

But keeping them together does not need a standard library, each python distribution could have a set of blessed packages, those packages could be "unblessed" without breaking applications using them

lmm · on May 23, 2019

If the two tasks require different skillsets (and they do) then separating them could easily mean more people are able and willing to help with one or the other.

kgwgk · on May 23, 2019

It does save manpower for Python maintainers if reviewing non-core libraries becomes someone else’s problem.

anoncake · on May 23, 2019

Either the reviewer is trusted and might as well review the libraries as part of the standard library or they are not trusted and reviews by them are useless.

alexchamberlain · on May 23, 2019

I wonder if something like Arch Linux's main repository vs AUR would work?

tialaramex · on May 23, 2019

I suspect you're gravely over-estimating how much value there is in the dustier "batteries included" modules over an out of tree component for security.

The problem is that security assumptions change. If you're looking at some code in 1995 that accesses an API, it seems reasonable that it disables validation for the SSL library it brings in to access an HTTPS URL. Handling that properly would have been a huge pain, faster is better.

But in 2019 you'd be appalled, how is this thing not depending on certifi and requests? As it is it's wide open to a MitM attack and you'd get blind sided.

A much older example: The C standard library includes a function that's very narrowly useful for copying strings into fixed width buffers with no terminator, called strncpy() and in 1970s Unix I'm sure this was invaluable. But obviously today "copying strings into fixed width buffers with no terminator" is essentially never what you wanted, but people assume based on the name that strncpy() will do some other thing they do want, and insert a security defect into their code.

geezerjay · on May 23, 2019

> A much older example: The C standard library includes a function that's very narrowly useful for copying strings into fixed width buffers with no terminator, called strncpy()

Arguably, what makes C the perfect example is not necessarily its standard library functions per se but the language's official string data type.

geofft · on May 23, 2019

> As a security engineer, I like having a core team and a standard library in place that has gone through a long mature process with experienced developers, instead of someone who just git pushes code every night.

This is an appealing sentiment, but you will get hacked if you rely on it.

Python 2's standard library didn't do SSL certificate verification by default until the fairly controversial PEP 476 from late 2014, which changed the default in a point release of Python 2.7. Through 2014 (and for some time later, given how slow it is to get new upstream releases through distros onto someone's computer), the standard thing to do to use HTTPS securely was to use requests, a module developed by someone who got commit privileges taken away on another of his projects for making multiple releases a day. You were more secure with his code than with the standard library.

I think you have no more idea who's on the other end of pushes to the standard library than who's on the other end of pushes to PyPI. (Which is to say, in part, that you can equally well have an idea of both of these if you put effort into it. I've met the maintainers of several of my third-party dependencies at PyCon.) Something being in the standard library doesn't mean it has a higher class of developer behind it. There's more overhead, but it doesn't mean there's more maturity; many third-party module developers are more experienced (either in their field, if it's something like crypto, or just in general as responsible developers) than standard library developers. And if anything it means that security updates are slower and rarer because the process is more painful.

Consider the argument of the 2009 paper "Security Impact Ratings Considered Harmful" https://arxiv.org/abs/0904.4058 (disclosure: I'm a coauthor). Whether people find a vulnerability important enough to patch has little correlation in practice with how exploitable it is. And the only thing that gets regular attention is the latest development version of the code. If some module was significantly refactored or reimplemented, no upstream developer is looking for bugs in the old version. So, if you want to be safe and you haven't personally both audited and fuzzed the code you're running, you actually want to be running the latest released version of that code, regardless of whether someone stuck a CVSS on the old version yet. I've met very few companies who can upgrade to the latest Python 3.x minor release promptly when it comes out (and several who are still on Python 2!). I've met many companies who can pip install the latest versions of their dependencies without too much trouble, though.

Scarblac · on May 23, 2019

> As a security engineer, I like having a core team and a standard library in place that has gone through a long mature process with experienced developers,

Yes, but do you like funding them? This is all freely downloadable open source software.

newen · on May 24, 2019

Plenty of people fund the Python Software Foundation. Their revenue is around 3 million dollars a year.

ivoflipse · on May 23, 2019

Then maybe those large enterprise organizations should start paying for the long mature process with experienced developers

korethr · on May 22, 2019

Hmm, should perhaps this be worthy of a major version number bump? Yes, it's not as large or as breaking of a change as the 2.7->3.0 transition, but by removing modules from the standard library, backwards compatibility is being broken. Now, perhaps most people don't use those features anymore, but for those people who do use and rely on these modules, a major version number change would be a welcome signal that their code will no longer work with the new version.

But, after the initial pain of the 2.7->3.0 transition, I doubt we'll ever see another major version number jump, even if a logical use of version numbers would merit it.

sago · on May 22, 2019

Python minor versions allow things to be removed and thus 'break backwards compatibility'. There is a process for warning of upcoming deprecation then removing it after that. For modules this is documented in PEP 4 (from 2000). Similar deprecation schedules are used by other important ecosystems such as Django.

We'd be on way more than Python 3 if the major version was bumped for each one.

nitely · on May 23, 2019

This. There are plenty of backward incompatible changes in every minor version bump.

jxramos · on May 23, 2019

I think he’s hoping they’d adopt semantic versioning, probably won’t happen.

https://semver.org/spec/v2.0.0.html

anoncake · on May 23, 2019

It's also a bad idea. Semver doesn't distinguish between removing an API that has been deprecated for a decade and Python 2->3. Or, for that matter, between removing one function and removing large parts of the standard library.

bauerd · on May 23, 2019

Both are changes that will break existing code. SemVer clearly signals this. Why bother with how long something has been deprecated?

anoncake · on May 23, 2019

You don't see a difference between removing a rarely-used module that has been deprecated for a decade, giving developers 10 years to replace it on the one hand and overhauling the entire language, breaking not just some but most code, on the other hand?

I'm sure Python 2.7 broke compatibility, but you don't see people refusing to upgrade from 2.6 ten years after its release.

zwp · on May 23, 2019

Red Hat 6 is supported for another 18 months and still ships with Py 2.6.

Python's slow-moving gently gently approach to breaking changes has not been good for the ecosystem. I'll be glad when 2.x is dead. 7 months 8 days 14 hours... https://pythonclock.org/

anoncake · on May 23, 2019

But that's just an enterprise distribution doing enterprise distribution things. If you don't upgrade in general, of course you don't upgrade Python.

zwp · on May 23, 2019

"just"? From where I'm sitting that's a pretty large chunk of IT.

This has knock-on effects: authors that want to deploy scripts/apps with the minimum fuss will avoid adding deps to whatever /opt-based repo RH ships Python 2.recent (and the hoops you have to jump through to install and activate that). So they remain compatible with 2.6.

All of the other applications and 3rd-party modules shipping with RH 6 are also chained to Py2.6.

Many conservative shops (industry verticals) will refuse to upgrade _anything_ until they absolutely have to. I suspect we live in slightly different IT worlds (lucky you!). This is a problem I see frequently and that's why I'm suggesting Python needs more strict impetus for timely upgrades, not more decade-long opportunities for balkanization and incompatibility.

anoncake · on May 23, 2019

What I mean is not that Red Hat is insignificant, but that it's not special that it does not upgrade Python. Even in non-conservative shops that generally would just use the latest version, using Python 3 instead of 2 was not a no-brainer for a long time.

bauerd · on May 23, 2019

Of course there's a difference, but it's irrelevant here, because no matter how long breaking changes have been announced, they break existing code. I prefer a versioning scheme that signals this.

utborin · on May 23, 2019

I was once a fervent devotee of semver. It's so predictable! It's structured! There's a system! Everyone loves a system, especially engineers.

I've since become disillusioned.

The problem with semver, in my experience, is that it's impossible to predict whether a change will actually break someone else's code. Of course there are certain classes of change that are more likely to cause problems for other people. Changing a function signature, or deleting a function outright, is obviously a breaking change.

But the line between breaking and non-breaking isn't a bright one. Move away from the obvious examples and things start to get murky. Even the humble bugfix can be problematic. What if a client application unwittingly relies on the buggy behavior? Now that fix is breaking for them. Is that a contrived concern? Maybe—though anyone who has written an emulator can attest that this a real problem.

What about a non-breaking feature addition? Let's say the new feature requires some extra branches in a function, but doesn't change the function's interface or behavior for people who don't use the feature. Fine, non-breaking. Now say these branches alter the function's performance, and a client application's batch job that used to run in under an hour now takes four hours. It does run, so it's not "broken," but four hours is an unacceptable runtime to the users. Is that still a non-breaking change?

What these examples demonstrate to me is that semver's breaking/non-breaking change concept is incoherent. It conceives of changes as universally breaking or non-breaking, out of context, but a change can only be breaking or non-breaking in the context of a specific client application. Even the seemingly obvious example of deleting a function is non-breaking for applications that don't use the function!

I think the way we release software reflects that we know this deep down, even if we don't admit it. Imagine how you might handle upgrading a library in an application you've written. The new library is a bugfix release. Do you upgrade the library, push it to production at 100%, and gallop off to lunch without so much as a glance over your shoulder? My guess is no. Personally I'd be running my test suite, reading the library's change log, making a gradual release, and keeping a close watch on instrumentation during and after.

The interactions between software systems are simply too complex, too nuanced, too specific to the particular applications. We put all these safeguards in place and do things cautiously because we've been burnt too many times. And we've learned that in actual fact, the line between breaking and non-breaking doesn't exist.

anoncake · on May 23, 2019

> I prefer a versioning scheme that signals this.

How about <Very breaking changes>.<Breaking changes>.<Bugfixes>? That's what Python already does.

Semver is garbage.

m463 · on May 22, 2019

> but by removing modules from the standard library, backwards compatibility is being broken.

API 101 - there is very little cost to leaving them in, but a hidden major cost to having them disappear, usually for non-developers.

--

By the way, "batteries included" is one of the BEST features of python.

Have you tried to fix something in your house with a "homeowner's toolkit" which is usually something like a hammer, pliers, 2 screwdrivers, a putty knife and a few more basic tools?

It is REALLY tedious, like writing a C program with a few basic tools like stdio and ctypes.

More languages need "batteries included", maybe like Perl.

I think if the cost of deploying a script is 1, deploying it + a dependency is literally something like 100x. You have to make assumptions about all the environments the script will run in, and they are usually wrong.

bigiain · on May 23, 2019

> Have you tried to fix something in your house with a "homeowner's toolkit" which is usually something like a hammer, pliers, 2 screwdrivers, a putty knife and a few more basic tools?

Sure. But did you read the list of proposed deprecations?

It's not like that homeowners toolkit becomes any more useful when it's also got a buggy whip tool, a set of special allen-like keys that only open the case of a TRS80 or a Commodore64, and one of those picks for getting stones out of horses feet.

Most of the proposed things to take out seem obviously "right" to me. AIFF sound file support? MacOS9 binhex? CGI? _Maybe_people are still using those things, but it seems to me like they're edge-case enough to let them know now that they'll need to do "pip install aifc" or "pip install cgi" sometime after late 2021 if they update their python installation (which they can choose to put off until at least 2024 or so, just like all of us who're still sometimes using Python 2.7...)

rlayton2 · on May 23, 2019

Yeah, I went into the article prepared for a "NO! Leave that in!", but I didn't realise just how much very old cruft was in there.

arcticbull · on May 22, 2019

> By the way, "batteries included" is one of the BEST features of python.

None of these libraries need to be nuked from existence as a result of this change. I'd wager they'll move into PyPI modules so that teams relying on them could safely continue to do so.

> Have you tried to fix something in your house with a "homeowner's toolkit" which is usually something like a hammer, pliers, 2 screwdrivers, a putty knife and a few more basic tools? It is REALLY tedious, like writing a C program with a few basic tools like stdio and ctypes.

That's one extreme, but I don't think that's what's being proposed. The proposed model is closer to what Rust does today, where the core is slim, opening up new potential use cases, and the more complex functionality built on top of it is left to the community to maintain.

Take a look over some of the modules they're deprecating, like smtpd. What kind of standard library requires an SMTP daemon built in? That's akin to a homeowners toolkit including a planishing hammer [2] for some reason.

> I think if the cost of deploying a script is 1, deploying it + a dependency is literally something like 100x. You have to make assumptions about all the environments the script will run in, and they are usually wrong.

Depends on how it's done, honestly. Check out this Rust "scripting" system [1]. It has full support for third-party crates.

[1] https://github.com/DanielKeep/cargo-script

[2] https://en.wikipedia.org/wiki/Planishing

m463 · on May 23, 2019

Sorry, my "batteries included" comment was independent of the changes proposed. I just love the rich library python provides (compared to other scripting languages).

arcticbull · on May 23, 2019

I hope you didn't take my reply to be aggressive in any way! Standard library is always a point of contention in any environment and it's always good to explore all angles. It's kind of the ultimate bikeshed in a lot of ways. Python indeed is a very batteries-included language and there's a lot to like about that. I wonder if there's a way of preserving that by differentiating between a 'lite' and 'core' distribution so we can still meet the goals of the embedded / small system community?

u801e · on May 23, 2019

> Take a look over some of the modules they're deprecating, like smtpd. What kind of standard library requires an SMTP daemon built in?

Just because it's not widely used doesn't mean it cannot be included in the stdlib. For instance, where I work, we have a MTA test framework that makes extensive use of smtpd to receive messages processed by our configured MTA which we then write to disk and subsequently make assertions on the message (e.g., contains certain headers, has certain recipients, etc.).

Having various protocols (client and server) in the stdlib is not a bad thing and, personally, I think it's very useful for testing purposes.

yawaramin · on May 23, 2019

You couldn't set up a library dependency on smtpd or similar for your test project?

kllrnohj · on May 22, 2019

> API 101 - there is very little cost to leaving them in

That's not true as newcomers will typically bias towards bundled modules instead of 3rd party ones. If it's well accepted that the bundled modules are bad in some way then you are encouraging further use of them by leaving them in. That's real cost, and not just on the Python maintainers side of thing.

CJefferson · on May 23, 2019

Just how is a beginner, or in fact anyone, supposed to go and choose the "right" 3rd party module, when it seems half the stuff in pip is either half implemented, or even just a hello world example.

anoncake · on May 23, 2019

An official list, containing only one library for each purpose, could solve the issue.

kllrnohj · on May 23, 2019

By simply asking that question you're already more likely to get a better result than if there was a known-wrong choice bundled into the standard library.

But really that's just a tutorial/discovery issue, which has many solutions. And not even a skill isolated to programming - how do you pick the "right" item on Amazon to buy?

m463 · on May 23, 2019

You can hide them or mark them deprecated in such a way that new scripts don't see them, but old scripts still function.

uranusjr · on May 23, 2019

How exactly do you distinguish new and old scripts? They are all just files from the interpreter’s perspective.

m463 · on May 25, 2019

that's a good question.

maybe you could make it explicit for new scripts?

    from __past__ import cruft

labster · on May 23, 2019

> By the way, "batteries included" is one of the BEST features of python. > More languages need "batteries included", maybe like Perl.

Agreed with both points, the lack of batteries makes Lua quite annoying for me personally. Python and Perl are much more comfortable because you don't have to hunt down the most common operations.

Of course, Perl 6 has decided not ship batteries included, replacing batteries with a fusion reactor instead.

c3534l · on May 23, 2019

The approval process for getting libraries okayed for production is a massive pain in the ass, and for good reason. Without a batteries included language, it'd be easier to write all those tools in-house than hope that maybe in two months I can get a library out in production, even though it's just JSON parsing or something. The "this is an official part of the language and held to high standards, guaranteed to be suitable for poduction and needing no approval beyond already being approved to use the language" makes Python way more preferable IRL than languages I love, but don't even include a random number generator in the standard library.

gugagore · on May 22, 2019

C also doesn't have a standard package manager. I think that's the big problem.

Built-in functionality is less important if you can easily use packages.

enedil · on May 22, 2019

It's also quite possible, that the people who use these features use Python 2 anyway.

mjevans · on May 23, 2019

Does this mean we're going to have Python 2 installed on things forever? (Particularly anything that touches NNTP servers, as there's no replacement for that lib)

viraptor · on May 23, 2019

It's simple code. It can be moved to a separate package and imported with the same name.

u801e · on May 23, 2019

If it's simple, then what's the point in removing it? Dealing with standard application level protocols like SMTP, NNTP, FTP, IMAP, POP, and HTTP shouldn't require installing third party packages.

viraptor · on May 23, 2019

So that they can be improved / maintained. "stdlib is where packages go to die" is kind of true. You can use contact http from stdlib for example, but that's how we get urllib, urllib2, 3rd party urllib3 and requests.

For the point of removing the modules, check the "rationale" part of the PEP.

u801e · on May 23, 2019

The original post you responded to mentioned that NNTP did not have a replacement, so I don't see how it could be improved/maintained if it's just not there anymore.

earenndil · on May 23, 2019

Which lib?

anoncake · on May 23, 2019

The one for touching NNTP servers.

anoncake · on May 23, 2019

> But, after the initial pain of the 2.7->3.0 transition, I doubt we'll ever see another major version number jump, even if a logical use of version numbers would merit it.

Do you think the transition would have been less painful if 3.0 had been called 2.8?

dagw · on May 23, 2019

If Python 2.8 had come out and failed to run almost all code written for Python 2.7 then that probably would have been truly disastrous for the language.

rozab · on May 23, 2019

some of these modules have actually been deprecated since before python 3.0

Mirioron · on May 22, 2019

Even if they don't bump up the major version number, people are probably still going to treat it as such. I think it makes sense to bump it up.

avar · on May 22, 2019

Relevant discussion 4 days ago, "Python's batteries are leaking": https://news.ycombinator.com/item?id=19948642

anoncake · on May 23, 2019

> Modules in the standard library are generally favored and seen as the de-facto solution for a problem. A majority of users only pick 3rd party modules to replace a stdlib module, when they have a compelling reason, e.g. lxml instead of xml. The removal of an unmaintained stdlib module increases the chances of a community contributed module to become widely used.

Developers don't have a compelling reason to use 3rd party modules instead of the standard library. Therefore they don't. You consider that a problem and want to encourage them to use 3rd party modules more.

In short, you want developers to use 3rd party libraries because there is no compelling reason to do so?

> A lean and mean standard library benefits platforms with limited resources like devices with just a few hundred kilobyte of storage (e.g. BBC Micro:bit). Python on mobile platforms like BeeWare or WebAssembly (e.g. pyodide) also benefit from reduced download size.

That's a silly reason. Just make a separate distribution with a stripped down standard library.

int_19h · on May 23, 2019

To give a specific example, there's a lot of headache that new Python users could be spared if only they used requests instead of urllib2. But they don't know that it's compelling if they never go look, and just stick to stdlib.

anoncake · on May 23, 2019

Good point. However, there's a box saying that request is recommended at the top of the urllib.request docs.

https://docs.python.org/3.8/library/urllib.request.html#modu...

ben509 · on May 23, 2019

Unfortunately, I think people usually search StackOverflow, or just type into Google and get pointed to SO. If they type in "how do I download a web page in python" they get:

1. urllib [1] 2. BeautifulSoup and a comment mentioning requests [2] 3. requests [3] 4. urllib, httplib [4]

Which is looking better than I expected... but no information on 2 or 3 relating to how you install those libraries. So 1 and 4 will Just Work.

[1]: https://stackoverflow.com/questions/45717889/read-the-text-o... [2]: https://stackoverflow.com/questions/26050064/automating-down... [3]: https://stackoverflow.com/questions/44553348/how-to-download... [4]: https://stackoverflow.com/questions/2646288/retrieve-some-in...

pfranz · on May 24, 2019

I'd love if something like requests was included along with Python. But requests uses urllib3, so it's not like you could get rid of that. The main page for requests says they're working on a v3...you wouldn't want that kind of churn in the stdlib. Personally, I kind of like the structure of low-level libraries and fancier high-level libraries I use most often.

So I'm not quite sure what is the best thing to do. It sounds like requests III is a Python3 forward version. So it might make sense to ship that?

Python has decided to include packages developed outside of the stdlib before; I think json, pathlib, and maybe optparse.

carlmr · on May 23, 2019

>> A lean and mean standard library benefits platforms with limited resources like devices with just a few hundred kilobyte of storage (e.g. BBC Micro:bit). Python on mobile platforms like BeeWare or WebAssembly (e.g. pyodide) also benefit from reduced download size.

>That's a silly reason. Just make a separate distribution with a stripped down standard library.

True, you could have a std and core library. I think Rust does this for embedded. IMHO if you take away the large standard lib, except for pandas I have no more reasons to use Python. I don't like the language that much, but the std library is pretty good for scripting when you need something on a target system that only has standard Python.

hoseja · on May 23, 2019

They want to stop supporting the old modules because to keep them in working order is almost useless and uses too much manpower.

anoncake · on May 23, 2019

That's one of three reasons. https://www.python.org/dev/peps/pep-0594/#rationale

chubot · on May 23, 2019

Since the parser module is documented as deprecated since Python 2.5 and a new parsing technology is planned for 3.9, the parser module is scheduled for removal in 3.9.

Hm interesting comment. Does anybody know what the new approach to parsing Python in 3.9 is ? I searched python-dev@ but couldn't find any references to it.

I found an interesting tidbit about Rust "switching" from LL to LR here:

https://www.reddit.com/r/ProgrammingLanguages/comments/brhdt...

And I noticed some rules in Python's grammar that are awkward in LL parsing (set and dict literals, and comprehensions).

I wonder if those things motivated the switch? They certainly work though.

ivoflipse · on May 23, 2019

They're having the discussion on Discuss: https://discuss.python.org/t/preparing-for-new-python-parsin...

chubot · on May 23, 2019

Thanks a lot! That led me to find this November thread:

https://discuss.python.org/t/switch-pythons-parsing-tech-to-...

And I posted here about it:

https://www.reddit.com/r/ProgrammingLanguages/comments/brz2y...

It looks like the set and dict literals I noticed weren't so much the motivating use cases, but even more fundamentally assignments and keyword args!

scrollaway · on May 23, 2019

Huh interesting. I've been using parsimonious for cases like this, didn't know python had a stdlib for it...

why_only_15 · on May 23, 2019

It appears that the ast module is the recommended one now.

jl6 · on May 23, 2019

> Times have changed. The introduction of the cheese shop (PyPI), setuptools, and later pip, it became simple and straight forward to download and install packages.

I’ve been out of the Python loop for a few years, but my last impression was that packaging and distribution of Python modules was far from a solved problem. Has this changed?

carlmr · on May 23, 2019

No, it's still one of the worst package management systems out there.

I'm always wondering if it has to do with the language and or just the package manager itself. Rust's cargo, Node's npm and probably quite a few others exist that work amazingly well.

bow_ · on May 23, 2019

Why do you think it is the worst? If you don't consider Python, what would you then think is the worst?

Genuinely curious. I use Python daily and I rarely encounter problems with it. There is some getting used to in the beginning, but I went through a similar phase when I started using npm and cargo as well.

What I can say is that I had to go through a lot of experimentation myself to arrive at the tools I use now (pyenv + Poetry). And if anything, maybe the lack of one way that is adopted by everyone in the community is the problem.

dagw · on May 23, 2019

What I can say is that I had to go through a lot of experimentation myself to arrive at the tools I use now

That is, I think, the big problem. Just using pip has never really worked so people have built tools on top of pip, in parallel with pip and replacements for pip. Basically there is no one way to do things.

sametmax · on May 23, 2019

npm, yarn, bower, jamjs, pnpm...

Besides, I have 15 years of Python behind me, I code and train in Python for a living, and I still use just pip.

pip and venv are packaged with recent versions of Python and do the job fine.

bmn__ · on May 23, 2019

Tests are not automatically run and centrally reported as part of the installation process.

There is no good heuristic for picking one package over the other when they occupy a similar problem space. This is mostly a cultural problem, the community's efforts are lackluster.

virtualenv is not installed by default, making bootstrapping into a separate prefix that is independent from the system installation unnecessarily aggravating.

Semiautomatic packaging tools (e.g. pypi into rpm) produce low kwalitee packages and manual intervention more often needed compared to similar languages.

Worse are languages that simply don't have much manpower behind them in absolute numbers, e.g. CL/quicklisp. Given Python's mindshare, the results are subpar.

ptx · on May 23, 2019

> virtualenv is not installed by default

It is, but it's called venv these days:

  python3 -m venv my_venv

(Unless you're sticking to a very old version of Python, but in that case there's nothing the Python developers could do about it.)

TomBombadildoze · on May 23, 2019

> Tests are not automatically run and centrally reported as part of the installation process.

I _hate_ this working on Java/Maven projects. Why do I need to run all the tests in order to install dependencies? The tests should have been run at _packaging_ time. I'm installing binary dependencies, why should I not trust that they have been tested?

These are fundamentally separate concerns. If you're worried about the robustness of your dependencies, review them first.

> There is no good heuristic for picking one package over the other when they occupy a similar problem space. This is mostly a cultural problem, the community's efforts are lackluster.

Stars on Github? Issues opened vs closed? Stack Overflow? The community is active in all sorts of places and it's not that hard to get a feel for the prevailing best tool for a given job. It just takes a little effort. The community's responsibility is to maintain those parts which they have created, not give you recommendations.

> virtualenv is not installed by default

python -m venv

> Semiautomatic packaging tools (e.g. pypi into rpm)

If you want to package your application and its dependencies as an OS package, that's fine. Do it as a single unit and isolate it from the rest of the system. We have venv (or pyenv if you need a different version of Python) to solve this.

I've experienced every headache there is with Python packaging. It's been bad, it is now better but still rather quirky, but it does require some domain knowledge. If you know how to use it, you will rarely experience serious problems.

pfranz · on May 24, 2019

I generally avoid pip, but have to use it once in awhile. I don't like that pip installs globally by default. I've messed up too many OSes because of a library I only needed for 5minutes. I always use --user, but I don't think that obliquitous (I don't tend to use venv, either).

I was trying to upgrade a package. I first looked for a "pip upgrade/update <package>"--doesn't exist. I then tried "pip install <package>" to see if it would offer to upgrade--it didn't. I think found "pip install --upgrade <package>".

I'm sure I'd learn these things if I did them all the time, but it's some slightly poor defaults (so slightly bad and impactful I can't imagine them being changed), and some discoverability and friction I don't see in package managers I use way less often.

sametmax · on May 23, 2019

npm is worse IMO.

First it forces the node_module in your repository, which mean you have to configure tons of tools to ignore it or you'll have a bad time.

Then there is no way to have comfortable local command, so plenty of time you gotta have sudo -g. Recent npm version now have a tool for that but the experience is meh.

Plus there is no builtin tool to manage several versions of node, so you gotta add nvm on top of that.

And then if you work on the front end, you can't even use whatever you npm installed stuff, you gotta setup the entire webpack shenaigan, making it literally the hardest stuff to setup. It's so complicated we rely on black box such as create-react-app to do the job for us.

jononor · on May 23, 2019

> Then there is no way to have comfortable local command, so plenty of time you gotta have sudo -g. Recent npm version now have a tool for that but the experience is meh.

export PATH=$PATH:./node_modules/.bin/

or put the command into `scripts` of package.json, and use `npm run-script foo`.

Have never used npm -g, and never will. Global packages are owned by the system package manager.

sametmax · on May 23, 2019

Local commands doesn't mean you have a project. It can be a general utility.

motles · on May 23, 2019

"Amazingly well" has become 500MB node_modules directories in your source tree?

wongarsu · on May 23, 2019

When I install the first 2-3 dependencies that causes npm to install 200 packages. I don't particularly like that, but it could be argued that this is a testament to how easy npm makes using third-party packages and how painlessly it does recursive dependency resolution.

carlmr · on May 23, 2019

Exactly. It is such an easy to use package manager that people barely think about adding dependencies.

scrollaway · on May 23, 2019

You should take a look at your virtualenvs once in a while. They're not particularly smaller and keep in mind that python has a much larger stdlib.

alexchamberlain · on May 23, 2019

What makes it the worst? Is it non-pure Python extensions, or is there something missing for pure python packages too?

lmm · on May 23, 2019

It's broken for pure python too. The standards were too heavily influenced by traditional unix sysadmins who love server-global libraries, so that's still the direction all the tooling guides you in despite being the worst way to work with Python. Every couple of years they rewrite it without fixing any of the problems. Sometimes they make it worse, as with e.g. moving pip into python. I struggle to imagine the decision-making process that could come up with these outcomes - surely they must be studying what works for other languages or there'd be no way they'd manage to avoid it all so perfectly.

sametmax · on May 23, 2019

Not anymore, .whl files are now similar to npm packages, with static metadata and just unpacked on pip installed. Most popular packages are .whl now.

stackzero · on May 23, 2019

The problem is there's no clear way to get started with setting up a python package. Every other day someone ask a question on SO about it and the first google result is outdated distutils IIRC

d0mine · on May 23, 2019

Have you looked at https://packaging.python.org/

sametmax · on May 23, 2019

It's actually pretty ok now, espacially with .whl for pretty much everything avoiding compilation and venv being provided with recent Python. The problem is the sheer amount of bad information you find online, so that people take hours figuring how things work.

E.G:

- people telling you to sudo pip install

- people telling you to use virtualenv instead of venv

- people not telling you to do python -m

- people not telling you about setup.cfg

pas · on May 23, 2019

Pipenv!

sametmax · on May 23, 2019

That is one of the bad information: people that don't know how to use pip and venv shouldn't use pipenv, since you need pio to install it properly, and it will manage venv.

Don't run before you know how to walk.

pas · on May 24, 2019

no. you don't

you need python (and the distutils module) and you can use the get-pipenv.py

and separating venv management from pip is what causes The Mess

nomel · on May 23, 2019

> A lean and mean standard library benefits platforms with limited resources like devices with just a few hundred kilobyte of storage (e.g. BBC Micro:bit). Python on mobile platforms like BeeWare or WebAssembly (e.g. pyodide) also benefit from reduced download size.

Python, lean and mean? Seems like an incredibly niche use case to restrict the python community to.

yjftsjthsd-h · on May 23, 2019

At that point I'd expect to be using micropython anyways.

coldacid · on May 23, 2019

Considering how many different types of data have and can be stored in IFF chunks (and their order-swapped RIFF counterparts) I'm almost insulted by the PEP author considering IFF to just be "an old audio file format".

I think my Commodore/Amiga persecution complex is acting up again.

oblio · on May 23, 2019

Wasn't the last Amiga sold in 1996? That's 23 (!) years ago.

coldacid · on May 23, 2019

You can still buy Amiga platform hardware, although it's not the classic Commodore-era stuff but rather the modern PowerPC based stuff.

kd5bjo · on May 23, 2019

> 3.8.0b1 is scheduled to be release shortly after the PEP is officially submitted. Since it's improbable that the PEP will pass all stages of the PEP process in time, I propose a two step acceptance process that is analogous Python's two release deprecation process.

Why should this be fast-tracked outside the normal process? I can’t imagine any of these removals are urgent.

0xbadcafebee · on May 23, 2019

If we're going to lean more on PyPI, it would be nice to clean it up some. There's a lot of cruft that makes it very hard to find the right module to use to write a new project with. And it would be nice if package naming convention were more standard, and suggested extending existing modules rather than writing entirely new ones. If you want HTTP functionality, you use the "requests" package, which uses urllib3. Why not io::net::http::client, extending io::net::http, and so on?

torlakur · on May 23, 2019

Well, that sounds like the package names (and namespaces) that the Perl 5 community have landed on with CPAN modules :)

aitchnyu · on May 23, 2019

Some batteries are surprising https://docs.python.org/3/library/

difflib - Text diffs, even html output

textwrap - obvious no?

rlcompleter - autocomplete symbols and identifiers, used in interactive mode

pprint - for printing complex data structures with indentation

reprlib - repr with traversal depth and string size limits fraction - Fraction('-.125') becomes Fraction(-1, 8)

statistics - averages and deviances tempfile — Generate temporary files and directories

glob — Unix style pathname pattern expansion

gzip, bz2, zipfile, tarfile - obvious

configparser - ini-like format

secrets - use this instead of random for safe cryptography

sched — Event scheduler

turtle — Turtle graphics

shlex — Simple lexical analysis

webbrowser — Convenient Web-browser controller, for the antigravity module!

sametmax · on May 23, 2019

textwrap - used by Python itself for docstring.

rlcompleter - used by Python itself for the shell.

pprint - trememdously useful for debugging.

statistics - added recently because people kept rewritting mean() functions that were broken

tempfile.gettempdir(), glob("*.ext"), gzip, bz2, zipfile, and tarfile are kinda mandatory for a language you use massively for scripting

secrets - we had random. People used it for security all the time and created stupid security holes. So we added this.

webbrowser — fantastic module that open a new tab in the default browser from any code. It's one file, so for the value it provides, I'm happy it is here.

Spiritus · on May 23, 2019

Surprising how? I've used most of the "batteries" you mentioned in the past year.

int_19h · on May 23, 2019

Turtle graphics is really handy for teaching programming. And it's important for new users to have minimal friction when setting things up, so having it in stdlib does make sense.

_fbpt · on May 23, 2019

LZMA is better than gzip i think.

Also secrets deserves to stay in the library, Python has hashlib and it should have a secure RNG by default.

glob too (or Path.glob).

And I've used shlex for command-line-escape parsing (it might not have been the optimal solution?)

masklinn · on May 23, 2019

> LZMA is better than gzip i think.

LZMA provides much better compression at much higher costs. Generally speaking it's pretty strictly better than bzip2, not necessarily gzip (DEFLATE, really).

In my experience, zstd can be considered better than gzip/deflate (almost every time I tried it, it provided as-good-or-better compression at much faster throughput).

maxnoe · on May 23, 2019

Yes, zstandard turned out to be the best option in all our tests on small to really large data (couple of mb to gigabytes filesize).

A few percent better compression than gzip and nearly 50 % faster decompression.

Gzip is pretty slow in the python standard lib.

The python zstandard bindings unfortunately do not allow back seeking.

monocasa · on May 22, 2019

MSI support removal seems premature. Can you do unsandboxed stuff from an AppX? Also can you install AppXs on Windows 7/8?

chungy · on May 22, 2019

How many people actually used Python's MSI library?

It was Windows-only (presumably a wrapper around the native tooling), and was primarily for creating Python's own installer, which apparently doesn't get built as an MSI anymore.

ptx · on May 23, 2019

I use it for packaging my Python application. But it seems pretty low-level, so I suppose I could call the win32 API directly through ctypes when they remove it.

WorldMaker · on May 23, 2019

You've been able to do Win32 stuff in an AppX since early in Windows 10 history, it's mostly "unsandboxed". MSIX, the "new" AppX, has Windows 7/8 support for Win32-focused installs in preview.

Semi-relatedly, Microsoft has already been exploring with the Python team MSIX deployments for the Python interpreter itself (you can even find it in the Microsoft Store now on Windows 10, and typing `python` on a command line in recent builds of Windows 10 will auto jump you to the Store if you don't have a python.exe in your PATH/installed).

WorldMaker · on May 23, 2019

(Also, getting in the weeds, the "mostly unsandboxed" refers to that MSIX installs will do a few things to insure at all costs clean uninstall, including system folder virtualization/redirection, registry virtualization/redirection, and similar techniques. Windows and MSI has been doing a lot of this sort of thing under-the-hood silently since XP as a part of compatibility work, MSIX just does it more obviously/automatically/always.)

Alex3917 · on May 23, 2019

The only thing I worry about is the uu module. Even though 99.99% of people will never use it, there are also probably literally millions if not billions of pieces of content that are encoded using that standard. I realize it's only a few lines of code, but from an archival perspective it seems like there's something distasteful about making it harder for future generations to access content that may have historical value. At least send a quick note to the library of congress or the archivist community first to get an idea of to what extent this stuff is getting used.

naniwaduni · on May 23, 2019

The useful functionality (encoding/decoding) is covered by binascii, which uu in fact uses internally.

What the uu module proper provides is a pair of awful file-to-file interfaces. Default to spitting the decoded file directly to disk at with the name in its header is, uh, questionable.

roblabla · on May 23, 2019

This module is pure python. It looks like if people rely on it, they could easily put the code in a pypi module, and maintain it there. It doesn't really make sense to keep it in the core if it has no dedicated maintainers.

takeda · on May 23, 2019

That code doesn't need any maintainers, it is not like UU encoding is evolving or really complex.

guitarbill · on May 23, 2019

That's a bit dramatic. Our knowledge of the encoding doesn't go away because Python removed support. Old versions will continue to work. The code is preserved in VCS. And finally, you/someone can release a 3rd party module to PyPI to fill the gap.

owaislone · on May 23, 2019

Isn't it just being removed from the standard library? It can still exist on pypi especially if it is as important as you suggest, someone will definitely host it on pypi.

masklinn · on May 23, 2019

As noted in the PEP, the uu codec itself is in `binascii` and not moving anywhere. `uu` is just a (pretty awful) high-level interface.

ehsankia · on May 22, 2019

I honestly didn't realize there was so much random stuff in stdlib.

varelaz · on May 23, 2019

There are much more of it. I think around 1/3 of standard library is legacy and burden for python. I recall there was a PyCon talk about why you may not want to be a part of standard library. Release process and support of old versions is very strict and heavy, it could delay releases and API change dramatically. You can look at situation with urllib for example, there are 3 versions of them, but most popular is requests 3rd party module.

yingw787 · on May 22, 2019

This seems like a vast swath of things to deprecate with one PEP, but it looks like the author of the PEP talked with a large number of core developers about issues that they had with the stdlib and decided deprecated packages that way, which I think should be good for approval of the PEP.

I have a couple of questions:

- I realize there may not be a fork with this PEP implemented, but how might this impact Python's local relative build time, and how might that convert over to the build pipelines? Drastically faster build times would be really nice.

- Are Python 2 -> 3 migrations for stdlib packages mostly rewrites or mostly tacking on compatibility layers like `from __future__ import unicode_literals`? If stdlib packages were updated with Python 3 syntax, it might indicate sustained demand for said package going forward. I'm not sure.

Znafon · on May 22, 2019

> Are Python 2 -> 3 migrations for stdlib packages mostly rewrites or mostly tacking on compatibility layers like `from __future__ import unicode_literals`?

Packages in the stdlib don't need compatibility layer since they are always used with the Python version they were written for, it's motly incremental rewrites.

Regarding the build time, I don't think most of the time is spent testing those parts of the stdlib.

enedil · on May 22, 2019

I'm not the OP, but I suspect that what was means was backporting features to py2.7, which presumably use some new Python 3 which haven't been built in the old language version.

simcop2387 · on May 23, 2019

I don't think that back porting features to 2.7 it's in anyone's radar anymore given that it goes end of life in just over 7 months. Anything that isn't already there is likely not going to benefit many people

Dylan16807 · on May 23, 2019

Lots of people are on python 2, and if they didn't already migrate then an EOL notice isn't going to be a big motivator.

dual_basis · on May 23, 2019

You'd be surprised I think. In my job, EOL noticed are all it takes, often, to push people onto the next version, even reluctantly. I imagine many situations where Python 2 is still used are not developer driven but, rather, business value propositions. Many of my clients wouldn't understand a discussion about upgrading because of Unicode strings, but if I were to mention to my clients that Python 2 is no longer receiving bugfixes and officially has reached EOL status you can bet they would all throw money at it.

kbumsik · on May 22, 2019

> how might this impact Python's local relative build time

stdlib usually written in Python, not in C. So I guess it won't affect the build time much. It will reduce test time but I'm not sure.

jstimpfle · on May 22, 2019

https://docs.python.org/3/library/fileinput.html

I've never had a strong opinion about "batteries included", but boy there are some weird ones in there...

ehsankia · on May 22, 2019

Especially when you start looking at the code for some of these

https://github.com/python/cpython/blob/master/Lib/imghdr.py

Znafon · on May 22, 2019

To be fair, it dates from 1992. I don't think PyPi existed at the time, it's rather impressive the Core team supported this for so long.

tomjakubowski · on May 22, 2019

Ruby includes something similar, ARGF. It's pretty useful over there. https://thoughtbot.com/blog/rubys-argf

jstimpfle · on May 22, 2019

Well, maybe fileinput is meant to be used for making shell one-liners using the -m switch or so, like in the link you posted, or like perl/awk. Does that play nice with Python's block syntax, though?

mftrhu · on May 22, 2019

Why do you think fileinput is weird?

jstimpfle · on May 22, 2019

The functionality is almost less than trivial. It's like, trading 3 lines of super straightforward code for one line of import statement and a library dependency.

Pretty sure it tries to emulate awk's programming model, but even if I had a use for that I'd rather write these 3 lines of code myself - so the code is actually clear without looking at documentation, and so that it can be modified easily.

bpicolo · on May 22, 2019

I actually really like fileinput - to me it's a great example of the good kind of batteries included. I appreciate easy built-in utilities for super common programming tasks! Those are great to have.

Different strokes I suppose?

pjungwir · on May 22, 2019

Not just awk but cat, grep, etc.---even python itself. It's nice to have something that "recommends" such a common Unix convention, a lot like getopt but moreso. In Perl this would be `while (<>) { ... }` and in Ruby `ARGF.each_line do |line| ... end`. I'm glad they are keeping it!

simcop2387 · on May 23, 2019

Actually with perl its even easier. perl -ne '... Process lines...'

fanf2 · on May 24, 2019

There’s a big risk with using <> in perl, if any of the command line arguments might come from somewhere untrusted - if you use perl it’s worth reviewing the discussion under https://perldoc.perl.org/perlop.html#I%2fO-Operators

daveFNbuck · on May 23, 2019

It was on the PEP's original list of things to remove, but they decided to keep it because it's handy for quick scripts. I use it all the time for scripts that I delete shortly after use.

naniwaduni · on May 22, 2019

Filters over lines of text may be domain-specific, but it sure as hell spans a lot of domains.

jstimpfle · on May 22, 2019

Idk, it never occurred to me the need for a library that does more than "for line in sys.stdin:" but less than what you can easily write in under a minute.

rabidrat · on May 22, 2019

It handles files being passed on the command line too. Your particular usage might only take a minute, but to make it play nice with other expected CLI usages, takes more thought (and you will probably forget something).

masklinn · on May 23, 2019

> Idk, it never occurred to me the need for a library that does more than "for line in sys.stdin:"

Of course if you'd only used fileinput for that it would be worthless.

However what it does is way more useful, namely iterate on the lines of all files provided as parameter (or sys.argv[:1]), fallback to sys.stdin if no files were provided, and swap the special sentinel `-` by sys.stdin.

anoncake · on May 23, 2019

You depend on the standard library anyway.

jstimpfle · on May 23, 2019

So using the library doesn't have a cost?

Liquid_Fire · on May 23, 2019

The cost to importing an extra module is trivial.

Why reinvent the wheel, when Python ships with an already written and tested solution? The 3 lines of code you would write are probably going to miss some corner case.

If I'm reading some unfamiliar code I would rather see one use of a standard library function call than several lines that I have to read and understand. And if it was really the first time I came across the `fileinput` module, I only have to pay the cost of reading the documentation once, and then forever benefit from having to read and understand less code whenever it is used.

jteppinette · on May 23, 2019

I found 330k references on GitHub.

jstimpfle · on May 23, 2019

They all look the same. https://github.com/search?l=Python&q=fileinput&type=Code

aepiepaey · on May 23, 2019

That's because it includes all the repos with Python forks/copies.

Here's a better search, still 100k+: https://github.com/search?l=Python&q=%22import+fileinput%22&...

larkost · on May 22, 2019

My one objection to this is the `imp` module. As far as I can tell it is the only way (short of `sys.path` modification) to specifically load a module found at a specific path. For testing systems I use this quite extensively... Looks like I need to make some comments...

saila · on May 22, 2019

You can do that with importlib too:

    from importlib.machinery import SourceFileLoader

ChrisSD · on May 22, 2019

That's been deprecated as well. The current docs say you should do this

    spec = importlib.util.spec_from_file_location(module_name, file_path) 
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

https://docs.python.org/3/library/importlib.html#importing-a...

comex · on May 23, 2019

Gross...

tgb · on May 23, 2019

I use imp.reload(my_module) all the time when %run-ing things from ipython that import my_module after editing it. What's the alternative for that use?

maxnoe · on May 23, 2019

Using importlib.reload

carreau · on May 23, 2019

In[1]: %load_ext autoreload In[2]: %autoreload 2

?

mleonhard · on May 22, 2019

I love this. I hope more and more cruft gets removed from our languages and tools.

0xADEADBEE · on May 23, 2019

This has been a long time coming and I'm surprised it's taken until 2019. A better time to have done this would have been back at the 2-3 migration; I'm not sure the community will be able to survive another transition like that, but hopefully lessons have been learned from the previous schism. A step in the right direction for sure!

saila · on May 22, 2019

Just one data point here, but with the exception of the cgi module a long time ago, I haven’t used any of these modules in the past 15 years of web programming, ETL scripts, or data processing. In the keep list, I’ve used fileinput once that I can remember.

Waterluvian · on May 23, 2019

I think an 80/20 solution is documentation UX. Just take all these old modules and put them into an "old modules" section and preface that section with a short discussion on what this all means and what the wisdom is.

robobro · on May 23, 2019

noo not the cgi module :(

ptx · on May 23, 2019

You can use wsgiref.handlers.CGIHandler instead, with the advantage of being able to easily move the application to some more appropriate WSGI environment later if needed.

zestyping · on May 23, 2019

I was wondering about that too. CGI is still just about the easiest and most universally supported way to serve dynamic web content from a Python script.

The complaint is that the module is designed poorly, which is fair; to remove CGI support without any "batteries included" replacement seems a bit of a shame, though.

mjw1007 · on May 23, 2019

The cgi module doesn't actually contain anything CGI-specific. It dates from a time when "CGI" was a sensible shorthand for "handling http requests".

What's actually in the module is a rather clunky, but serviceable, system for processing html form data.

skykooler · on May 23, 2019

Especially as, if you're in a situation where you have to use CGI rather than a better server worker solution, you probably also don't have access to the environment to install pip packages.

draegtun · on May 24, 2019

You have to keep moving forward with the times!

Perl removed CGI from its core back in 2014 - https://perl5.git.perl.org/perl.git/commitdiff/e9fa5a80

icodestuff · on May 22, 2019

I'm surprised at the removal of aifc. AIFF is still used somewhat in macOS.

schrijver · on May 23, 2019

AIFF is used widely in audio production software; for exporting uncompressed audio many application default to AIFF. It’s the most universally useful lossless format; WAV support is ubiquitous too but AIFF has better metadata. FLAC and ALAC are good alternatives today, but not on the same level of support yet. For example, for purposes of price differentiation, Pioneer reserves FLAC and ALAC support for their most expensive digital turntables.

If you’re writing a Python script that outputs an audio file, outputting to AIFF seems like a save bet. Then again I’m not sure how many people are actually using the module from the stdlib--but neither does the PEP’s author, it seems. His level of research was asking his friends on Twitter:

https://twitter.com/ChristianHeimes/status/11302577994753351...