Yes, Python is Slow, and I Don’t Care (hackernoon.com)
136 points by mithunmanohar1 on Sept 8, 2017 | 194 comments


I was all ready to savage his opinion after reading the headline, but I agree: looking at the architecture I designed for the company I work for, CPU isn't the bottleneck. Every time I try to increase performance by multithreading as much as possible, the databases start screaming.

On the other hand, the idea that dynamic languages are more productive than static languages is laughable. Statically typed languages prevent a lot of bugs and allow for a lot of automated, provably correct refactorings that simply cannot be done with a statically typed language. You can't even reliably do a "find usages" of classes in a dynamically typed language.


>On the other hand, the idea that dynamic languages are more productive than static languages is laughable. Statically typed languages prevent a lot of bugs and allow for a lot of automated, provably correct refactorings that simply cannot be done with a statically typed language. You can't even reliably do a "find usages" of classes in a dynamically typed language.

Exactly. I get quick and precise code completion, I catch plenty of errors beforehand, etc. I'd say I'm about 10x as productive in C# as in Python, with a similar amount of experience in both. Python only shines when there is a library you need that does something really well. For me, any productivity advantage in Python comes from lots and lots of libraries.

Also in terms of maintainability, I find my C# code easy to read and modify a year later when I've forgotten completely about it. In Python I need to rescan all of the types into my head until I can understand what the program does.

I mean with var and dynamic, C# offers everything you need for duck typing efficiency, while preserving the very important statically typed interfaces.


If you annotate your types in Python with the appropriate type-hinting comments, I find I get quite satisfactory code completion and error detection from my IDE (PyCharm). Perhaps not as much as Visual Studio might give me with C#, true, but then again I don't use them for the same things either.
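
For instance, something like this (a minimal, illustrative sketch; both PyCharm and mypy understand PEP 484 annotations):

    from typing import List, Optional

    def mean(values: List[float]) -> Optional[float]:
        """Return the arithmetic mean, or None for an empty list."""
        if not values:
            return None
        return sum(values) / len(values)

    m = mean([1.0, 2.0, 3.0])
    # The IDE now knows m is Optional[float] and will flag e.g. mean("abc").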

Each task has its set of preferred tools. To write a quick command-line tool to bridge a gap between two systems and automate our workflow a bit further, Python is just perfect for me and much more productive than C#. However, to create a system with a team of dozens spanning the whole skill gamut, I feel much more comfortable having the compiler as an active member of my team, policing the architectural vision for us. In that case C#/Java will win hands down.

If, however, speed and responsiveness are the prime worries, then I'll roll up my sleeves and bring in C/C++. Sure, my productivity will take a hit, and the result will likely be harder to maintain as well, but if this effort makes the difference for my customers in their productivity, then it's well worth the investment.

Often the final system will be a mix of all these technologies, each used where its strengths shine, to maximise their impact.

Yet the technology is worthless when faced with inadequate code structure and system architecture. Coupled convoluted code will be a PITA no matter the language used.


I feel this about Haskell compared to Ruby: Really easy to jump back in to code I set aside months ago.


>In Python I need to rescan all of the types into my head until I can understand what the program does.

Couldn't that be solved with sane variable naming conventions and docstrings/documentation?


Maybe. But it can be more easily solved in a strongly typed language, where you can right-click on a method and do "find usages", and it can be done algorithmically.


Of course that depends on how much more you have to write in one language compared to another. Say you're 10x more productive but have to write 5x the code; that sort of cuts down the advantage somewhat.


Who cares about writing a lot of code that is boilerplate really?

That is not where the productivity sinks are. They're in build systems, package managers, the existence of good libraries, and ease of debugging, as well as limiting the number of accidents while coding.


    dynamic f = x => x * 2;
Duck typing doesn't seem to be offering everything.


You sound like someone who hasn't used dynamic languages in anger, or you'd mention some of the things that dynamic languages do well that statically typed languages aren't so great at, to prevent your argument sounding like a straw man.

For example: dropping into a debugger (binding.pry / pry_remote in Ruby) to write code interactively in the context of the application, transferring that code to the source, and continuing on with my next fragment of functionality with a refresh of the web page, no recompiles or restarts required. You can do this with some difficulty and certain caveats in a statically compiled language, but only if things have been set up in just the right way, with automatic dependency recompilation and reloading, hot code replacement, etc. And even then, there are limitations on the kind of code you can write in the editor, depending on the underlying technology. Typically you can't create whole new classes, or introduce new fields etc.

Or consider levering up your language. Dynamic languages often give you the ability to create "new" syntax via expressive literals. This opens up more avenues for declarative programming; construct a data structure that models your problem more directly, and then write code that interprets the data structure. You want extremely lightweight literals for this, readable with minimal ceremony, including lambdas for when you need an escape hatch, a little pocket of custom code embedded in the bigger data structure. Statically typed languages have little context to infer types for such lambdas, unless they can generalize from the operators applied, but then you have a generic function, not a piece of data. So you end up infecting your DSL with type annotations and cruft, and before you know it the whole thing is hiding the forest behind the trees.
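
To make that concrete with a hypothetical Python sketch (not from the comment above): a small declarative rule table built from plain literals, with lambdas as the escape hatch, interpreted by a few lines of code.

    # A declarative rule table: plain data plus tiny lambdas as escape hatches.
    PRICING_RULES = [
        {"match": lambda order: order["qty"] >= 100,        "discount": 0.15},
        {"match": lambda order: order["customer"] == "vip", "discount": 0.10},
        {"match": lambda order: True,                       "discount": 0.00},
    ]

    def discount_for(order):
        # Interpret the data structure: the first matching rule wins.
        for rule in PRICING_RULES:
            if rule["match"](order):
                return rule["discount"]

    print(discount_for({"qty": 120, "customer": "regular"}))  # 0.15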

Thing is, if you aren't used to using these tools, you might not even know they exist, and you may be missing out on productivity you didn't know you could have.


I've come to the realization that static vs dynamic is more of a way people think about problems than anything else.

Personally, I've never met a static language that was remotely close to a dynamic one in terms of productivity... but a big portion of that is the sheer volume of extra code necessary in a static language, combined with personal workflow (as you describe).


You're describing tools usable in writing new code.

What static languages thrive in, however, is maintaining old code that was written by somebody else.


Which is the reason why it makes sense to write new applications in a dynamic language, and later port critical parts of it to more static compiled ones.


s/port critical parts of it to/dump everything and rewrite from scratch in


Sure, whatever it takes. I've just written the same thing at another recent thread, so we agree on that.


I'd take the point about literals a half-step further. It's really nice to be able to throw a string at a function and have that function compare it to a literal value using == instead of strcmp - and oh yeah, someone was just about to reply saying I should always use strncmp for extra productivity loss. Or I can throw a list/dict of any random type around without lots of ceremony. Do I lose type checking? Sure. Then again, those "expected int, got unsigned int" warnings aren't very helpful sometimes. "Must compile without warnings" is a good idea for long-term maintainability of long-term code, but when you're writing a simple script or still experimenting with an algorithm, that crap just gets in the way. A minute or two defining and aligning types might not seem like much, but when you have to do it ten times in the course of what was only a one-hour task to begin with, it's a pretty significant hit.


> It's really nice to be able to throw a string at a function and have that function compare it to a literal value using == instead of strcmp - and oh yeah, someone was just about to reply saying I should always use strncmp for extra productivity loss.

Who uses straight C? C++ and C# have operator overloading. You can overload any of the operators for a class and get sane semantics for addition, subtraction, equality, etc.


A lot of the largest and most critical codebases, probably including most of the software you cause to execute every day, are still in C. Also, operator overloading doesn't solve the general problem and creates a few of its own, but thanks for the condescension anyway.


> For example: dropping into a debugger (binding.pry / pry_remote in Ruby) to write code interactively in the context of the application, transferring that code to the source, and continuing on with my next fragment of functionality with a refresh of the web page, no recompiles or restarts required. You can do this with some difficulty and certain caveats in a statically compiled language, but only if things have been set up in just the right way, with automatic dependency recompilation and reloading,

You've been able to do that with ASP.NET MVC for years. You can change your views and hit save without having to start over.

But on the back end, why is "restarting" such a heavyweight operation? If I'm iterating on a piece of back-end code that takes a while to get to in the debugger, I use the testing framework as a harness to run small bits of code: not a real unit test, just an easy way to set up the prerequisites to step through a piece of functionality.

> This opens up more avenues for declarative programming; construct a data structure that models your problem more directly, and then write code that interprets the data structure.

Even C++ has a level of reflection these days. But why wouldn't you be able to do that with C# and attributes, or Java with annotations?

> You want extremely lightweight literals for this, readable with minimal ceremony, including lambdas for when you need an escape hatch, a little pocket of custom code embedded in the bigger data structure. Statically typed languages have little context to infer types for such lambdas, unless they can generalize from the operators applied, but then you have a generic function, not a piece of data.

C# has had type-safe lambdas for a decade. And as for a generic function, you can define a lambda/LINQ expression that can be reused and that can generate C# IL code at runtime, a MongoQuery when using the Mongo driver, or a SQL expression when using Entity Framework. LINQ expressions and the expression trees they generate are very powerful, versatile and type-safe.


Hot reload works well in Java and Dart (using IntelliJ).

As you say, you generally can't add new fields or classes, but it's not clear to me how important that is? Doing a full restart occasionally is not that big a deal.

I am skeptical about the long-term maintenance of systems with custom DSL's. Unless it's something popular like JSX, it seems like you end up with your own language dialect that nobody else understands?


I came here to make the same point about statically typed languages. My previous job mostly involved writing web servers/services in Python. A lot of the existing unit tests in our codebase were solely there to check for type safety (one of the developers that had been there for most of the lifetime of the codebase over engineered a bit with OOP in Python). Now at my current job, we use Scala. I still prefer Python for scripting, but it is very nice to be able to write code and have the compiler do most of the work when it comes to making sure everything fits nicely together.


This has been exactly my experience after working for years in Ruby and returning to Java & Golang. It truly does take me longer to write code without my precious Ruby, but that time is more than made up for by not having to write tests for every single possible code path, purely to check for type safety.

I've also experienced an unexpected benefit in that it is a lot easier to read other people's code since types are explicit and easy to see.


"the idea that dynamic languages are more productive than static languages are laughable." -- being statically typed or dynamically typed comes with its own set of tradeoffs and what a person is more productive in is a highly subjective matter. Lispers are more productive in Lisp than Haskell and vice versa.

"Statically type languages prevent a lot of bugs and allow for a lot of automated provably correct refactorings that simple cannot be done with a statically typed languages." -- not true, Clojure is trying to do that with Clojure.spec and specifications being checked at runtime can get you closer to things you could have automatically proved correct only with languages with dependent types, nothing against statically typed languages but I feel that your sweeping generalizations hurt the point you are trying to make.


I recently saw an analysis of frequency of bugs per language according to a breakdown of Github repositories, using issues and branches as a way to quantify bug reports.

It was very interesting to see that Clojure projects were among the least buggy of all languages represented. The report looked at a variety of factors to explain "bugginess", including typing and LOC, among other things. The conclusion in Clojure's case was that the very low line count necessary to write Clojure programs was vastly more influential on bugginess than for languages that were statically typed.

And I suppose that's true: a language that requires you write very little code is going to produce programs that are easier to write (and thus maintain), regardless of whether it is statically or dynamically typed.


That idea is old enough to have entered the original edition of The Mythical Man-Month.

That said, I have never seen any study like this where I didn't have some serious disagreement over methodology or conclusions. (Ok, except for that one on Peopleware.) It is a hard subject, and running a study over "the population of GitHub" is problematic by itself.


In my own subjective experience, I've found the idea to make a lot of sense and probably be true. I've done equal amounts of professional work in C++ and in Clojure, both languages I like a lot (and they are about as different as two languages can be). To accomplish a task in C++ takes probably 10x the code of Clojure. I would say that my bug-hunting time spent is maybe 5x in C++ vs. Clojure. This, despite that C++ is quite strictly statically-typed.


That is my impression too. I find that the most relevant metric impacting bug density is the number of lines in this one project, and it's very non-linear.

That gives an advantage both to languages that reduce line count and to languages that make it possible to abstract the plumbing into a generic library (if it's not generic, it's part of this one project). I think mostly everybody has that same impression.

Still, we shouldn't pretend that impression is a scientific certainty.


Sounds like an interesting analysis; I would love to know where Erlang stands. Terseness is definitely a factor in play, but I think at the end of the day it is all about where the priorities lie in language design. Some languages really take errors and fault tolerance seriously; Erlang is famous for that, and it is dynamically typed. Same with Clojure: being a Lisp, the language design is much less rigid, which allows for improvements like core.typed and clojure.spec.


> Clojure is trying to do that with clojure.spec, and specifications checked at runtime can get you closer to things you could have automatically proved correct only in languages with dependent types. Nothing against statically typed languages, but I feel that your sweeping generalizations hurt the point you are trying to make.

"Checking at runtime* is exactly the problem. Why would checking at runtime be more reliable than compile time?


>"Checking at runtime is exactly the problem. Why would checking at runtime be more reliable than compile time?*

If you use metaprogramming (e.g. Lisp macros, Scheme/Racket macros, Clojure macros), then the only way to have truly reliable checks is at run time.

In fact, for debugging and testing, it's much better to have a runtime able to do many things (such as live patching of the code).

Contrary to a language like C or C++, where the compiler produces machine language from the source and then goes out of the picture, in most of the Lisp-family languages your program runs along with the runtime; this runtime also contains the compiler itself and is usually big on features; it will not just catch type errors, it will show you all kinds of important information about any error and will allow you to correct the error if you like, without having to recompile and start again.


> Contrary to a language like C or C++, where the compiler produces machine language from the source and then goes out of the picture,

I use C#. When you create a LINQ expression, you get type safety, and it "compiles to" an expression tree that is converted to its runnable form at runtime -- in my case a SQL statement, a MongoQuery, or C# code. It really comes in handy. Theoretically I could change my backing store from Mongo/C# driver to SQL Server/EF without changing any of the queries. I do this all the time when unit testing: I substitute an in-memory list for a database table and still get complete type safety.


> If you use metaprogramming (i.e. Lisp macros, Scheme/Racket macros, Clojure macros), then the only way to have truly reliable checks is at run time.

That is a misconception. The macros output a target language and that can be checked. Plus macros can do their own checking before that. A macro expander can perform some static checks. In TXR Lisp, warnings about unbound functions and variables come from the macro-expanding code walk.


In addition to the points flavio81 made: you check at runtime because, unless the language is dependently typed, i.e. has types as first-class citizens (for example Idris), a lot of the information about your code that you want to prove correct is not available at compile time (for example, you have an integer, but is it non-negative? A list, but is it sorted?). It is not about being more reliable, it is about being possible at all.


A run-time check will work even if the compiler has a bug which mistranslated the code or neglected to do a check.


'Dynamic languages' is too often used as a shorthand for, or interchangeably with, 'scripting language', as here. There are a ton of language features more relevant to productivity, automatic memory management being the biggest IMHO. Interpreted vs. compiled makes a difference too, because your code->launch->test cycle can be so fast. But I agree: I'm a huge Python fan, and even though I appreciate not having to gum up my code with type declarations, it's not really that big of a win in terms of productivity.


But with a statically typed language, you don't need to go through that code -> launch -> test cycle as often because everything is type-safe.


I recently found and fixed a 5 1/2 year old bug that was caused by a caller assuming that an integer parameter is an absolute quantity, but the function treated it as a delta. (The really funny thing is that the commit which introduced this bug introduced both that function and the incorrect call to it; and then, for the next 5 1/2 years after that, subsequent new uses of the function elsewhere in the code all correctly use its argument!)

The expressions x - y and y - x often have exactly the same type tree, but calculate a different value. Same with f(x, y) and f(y, x) when the parameter types are the same.

To actually catch real bugs such as this with types, you have to make such a fine grained use of types that programming becomes nigh intractable.

If x has type minuend<integer> and y has type subtrahend<integer>, then x - y is well typed, whereas y - x isn't. The problem is that the values come from some other context in which minuend and subtrahend are meaningless (and, in any case, are not involved).
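
For the curious, here is roughly what that fine-grained typing looks like in Python with typing.NewType (hypothetical names): mypy rejects the swapped call, but every caller now has to wrap plain integers.

    from typing import NewType

    Minuend = NewType("Minuend", int)
    Subtrahend = NewType("Subtrahend", int)

    def diff(x: Minuend, y: Subtrahend) -> int:
        return x - y

    a, b = Minuend(10), Subtrahend(3)
    diff(a, b)  # fine
    diff(b, a)  # mypy: incompatible argument types (still runs fine at runtime)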

The kinds of errors that are found by static typing in practice are those that, under dynamic typing, would only sneak into production if the code was not tested at all. I.e. if the code had been executed even once, with any inputs, the problem would have shown up.

Plus, if you assume that "dynamically typed" language means "language with no static checks", you're arguing against a strawman of dynamic typing. That is even more ridiculous if you're pitting True Scotsman's state-of-the-art static typing against this dynamic strawman.

(Of course, some dynamic languages in widespread use do more or less look like that strawman; however, the best examples of dynamic typing are languages which are considerably informed about type prior to executing a program, like the SBCL dialect of Common Lisp, for instance.)


This is why we should really push for automatic logic checkers and proper support of programming by contract.


You can say that something is a floating point number (a type) but what if that something must be between zero and one? A type can't tell you that.

Typing has its place, but the idea that it catches all bugs at compile time certainly isn't true; the cycle of running the app itself and running its tests is just as important for statically typed code as for dynamically typed code.

Languages that have a sophisticated contract system* (like Clojure's clojure.spec library) have a bit of a leg up, in my opinion. You can much more precisely explain, control and test the specifics of all values flowing through your system, well beyond merely what type they are.

-- * Clojure's devs don't refer to spec as a contract system because it is quite a bit more versatile than that, and some are actually using it for compile-time checks, compile-time generative testing, and other nifty things that aren't covered by a "contract" system.


> You can say that something is a floating point number (a type) but what if that something must be between zero and one? A type can't tell you that.

Of course it can. The idea behind a decent type system is to make illegal states unrepresentable. You define a type which contains a value between 0 and 1 (let's name it `probability`) and make a function `float -> probability` which checks the range. That way, every time you see `probability` in your code, you know it is in the right range.
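
A minimal Python sketch of that smart-constructor pattern, using typing.NewType (illustrative names; the range check itself still runs at runtime, but a checker like mypy then tracks the Probability type at every use site):

    from typing import NewType

    Probability = NewType("Probability", float)

    def probability(x: float) -> Probability:
        # The only sanctioned way to obtain a Probability; the range is checked once, here.
        if not 0.0 <= x <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        return Probability(x)

    def combine(p: Probability, q: Probability) -> Probability:
        # Inside here we can rely on both values being in [0, 1].
        return probability(p * q)

    probability(0.25)   # fine
    # probability(1.5)  # would raise ValueError right at the boundary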

The problem with spec is that it can only verify what it sees at run time, whereas a static type system by definition has 100% coverage of your entire program.


You are talking about a function (at runtime), not a type, verifying the range. And you can do that in any language, regardless of its stance on typing.


http://www.adaic.org/resources/add_content/standards/05aarm/...

Compile-time and run-time floating point numbers constrained by precision and range. It's been a long time so I'm just going to include the examples in that document:

  type Coefficient is digits 10 range -1.0 .. 1.0;
  type Real is digits 8;
  type Mass is digits 7 range 0.0 .. 1.0E35;
  subtype Probability is Real range 0.0 .. 1.0;
These will be checked (to the limits of the ability of the system) at compile-time, and by default enforced at run-time. You can disable the run-time checks if you want for the sake of performance. The compile-time checks will "fail" to detect things where computations beyond the compiler's ability to reason might produce an invalid value.


Cool, but I don't know much about Ada, and I would like to see an example of a statically typed "range checker" in a more mainstream language in active use today. C++ templates can do a lot of magical stuff at compile time, but such techniques are well beyond the reach of most developers.

I can't think of any typed language I've used or seen in active modern professional work where you could validate a value as part of a type at compile time.


C# with "code contracts" allows you to annotate function arguments with valid ranges which will be statically checked at compile time. It's quite verbose though.

https://docs.microsoft.com/en-us/dotnet/framework/debug-trac...


Not without extra-compiler tooling like other static analysis tools, you're probably correct. I haven't had a chance to use much Ada; it's always been a pleasure for me to work in because I get to use things like the above to make my code so much clearer.


I think you can get quite far in C++ with a class that has a constexpr constructor taking a double that checks for range, and (quite a few) operator overloads that allow one to do the normal operations on such quantities.


Pascal has integer range types, but Pascal hasn't been mainstream for a long time now.


> And you can do that in any language, regardless of its stance on typing.

Yes and no, because nothing prevents you from passing the wrong kind of value in an untyped language unless you check the runtime type yourself in the function and reject it. And you can only do that at runtime, at which point you won't be able to immediately see where you're accidentally passing in invalid data unless you hit that exact branch in your code. So actually, just the type tagging is quite useful already.

Also, it looks like more powerful dependently-typed languages like Idris[0] (and probably also Agda and Coq) can encode this information in the type system.

[0]: https://stackoverflow.com/a/28436452


> what if that something must be between zero and one? A type can't tell you that.

Actually, it can, using dependent types :) I'm a Python fan, but Type-Driven Development with Idris has been a very interesting read.


Yes but there are no mainstream or production-ready dependently typed languages, so I didn't mention that. But I wouldn't be surprised if dependent typing really is going to be the distant future of programming languages. Perhaps not too distant.

I also wouldn't be surprised if Clojure does surprising things in this area by taking clojure.spec further into the compilation stage; some of the most interesting research into type systems has actually been done by Lispers (e.g. Typed Racket, core.typed), and there is active interest in it in the Clojure community.


> Yes but there are no mainstream or production-ready dependently typed languages, so I didn't mention that.

(For a number-type that is limited to eg be between 0.0 and 1.0):

Ada?

http://www.adahome.com/rm95//rm9x-03-05-07.html


Ada provides ranged types but is not a dependently typed language. So I guess it sort of falls into this particular example, but would not apply to other examples in this arena.


> You can say that something is a floating point number (a type) but what if that something must be between zero and one? A type can't tell you that.

Sure, it can; many common numeric types include specific ranges, the fact that the range 0-1 doesn't match any common floating point type doesn't mean that it isn't one that could be a type.


> many common numeric types include specific ranges

Can you give me a mainstream language example?


Every fixed-size numeric type.


> A type can't tell you that.

Yes, it can! --- but you don't necessarily want to confound your program with such a thing.

The old adage about the cure being worse than the disease applies.


> Yes, it can!

Ok, how? Other than an Ada example, or dependently typed languages that aren't used in production, can you offer an example?


> Other than an Ada example, or dependently typed languages that aren't used in production...

Every time someone gives you an answer, you modify the question to exclude it. Eventually you will have excluded everything that refutes the fallacy in your original post, but it is still there.


Perl 6, for instance, supports subset types such as:

  subset Positive of Int where * > -1
OTOH, while Perl 6 supports some static type checking, subsets of this kind seem to support only dynamic, not static, enforcement.


Such subsets are readily available in all dynamic languages I've used. But the issue here is really about compile-time checking these things, and that's a different animal.


    case class Probability(value: Double) { require(value >= 0 && value <= 1, "probability must be between 0 and 1") }


What's the language and what does "require" do? Is that a compile-time mechanism?


Uh ... no. Type safety is rarely a problem from my personal experience (YMMV) in code dev. The issues I run across have to do with whether or not the code is an accurate reflection of the algorithm in question, or even if the algorithm itself is correct, if core assumptions are correct, dealing with corner cases, etc. Types rarely have anything constructive to add to this mix, regardless of whether or not I am working in a weakly/non-typed language, or strongly typed language.

I personally tend to do something much closer to TDD, and break my code out so I test each part with its own test rig. So I see when I do dumb things that need re-work. Again, typing is pretty much never an issue there.

While this is all anecdotal, I personally have found that enforcing boilerplate type defs actually increases my coding error rate. More LOC gives you more opportunities for errors.

FWIW, Julia (http://julialang.org) is IMO Python done right, and it is strongly typed. But done in such a way as to not actually get in your way most of the time. Like Python it has a great REPL, fantastic FFI, rapidly growing module list, very active development. Unlike Python, it has no issues with whitespace[1], is compiled (JIT for now, though static is planned from what I understand), is very fast, has multiple dispatch and a strong typing system.

The real point the article was making is, programmer time is the most valuable thing to optimize for. I completely agree with this, even though I disagree that Python is the right tool for (most) every job.

There are better tools IMO, which allow me to be far more productive, and spend as little time as possible on worrying about types ... which are rarely ever a concern for me.

I thought his comment about string processing was interesting, though he completely ignored Perl in the mix. I've found string processing in Perl to be almost trivial. It's harder (far more verbose boilerplate) in Python, and gets worse in other languages. Perl6 lets you embed grammars and write parsers without resorting to external modules (Marpa::* in Perl5). That takes string processing, parsers, etc. to a whole new level, while making you even more productive.

But, as the author had noted, he is a pythonista. Everyone has their biases (including me). I want to code correctly quickly, and not have things that shouldn't get in my way, get in my way. Including type systems, massive boilerplate/odd syntax rules, etc.

[1] It is 2017 ... structure by indentation is somehow, still a thing. It shouldn't be IMO.


I was about to ask you what kind of statically typed languages you were using that gave you this impression. But... "boilerplate type defs".

Yeah, C++ typing system sucks. Just don't think this generalizes to other languages.


Well, let's see.

Go: map[string]int64

Java: HashMap<String, Long>

Rust: HashMap<&str, i64>

I think Haskell is one of the only statically typed languages which doesn't really care about algebraic typing. And yes, if you're talking about static typing, you get the baggage of dealing with algebraic typing to go with it as a rule. You can't arbitrarily separate the two until more languages follow Haskell's route.


It's not just C++ though. If I have to add logic to my code to handle explicit type changes (casting, et al.) for something that, honestly, is not important to my algorithm, then this increases the likelihood that I will make a type/casting/conversion error.

OTOH, a language smart enough to (correctly) infer this (Perl, Julia, etc.) generally won't have a problem with it, and will handle these issues for you. Julia still allows you to be very explicit about types, and to force hard-typed, specialized code. The benefit in that case, with type specialization, is that it can generate far more efficient code for the specific logic.

That latter argument is, to me, the only real benefit of typing systems that I've personally encountered. I know people throw studies around claiming "fewer errors with stronger type systems", but ... to be honest ... I have not experienced this. Rather, in stronger typed languages, I spend more time hunting down type impedance issues than logic issues.

This seems to not be a positive benefit to me. I may be alone in this regard, or not. I don't know.

I do know that, like programming languages, editors and operating systems, this viewpoint (pro/anti strong typing) tends to take on a religious overtone, in the sense of people taking a position and digging their heels in over it, elevating this aspect to an important point in an overall platform decision process (what should we develop in?) when, maybe, it shouldn't be.

In an odd way, I've seen this in the industry for a while. There were previous incarnations of this. Like Dijkstra's famous "goto" comments[1]. I don't necessarily agree with this, and I argue that people can write bad code in any language.

Typing systems ostensibly are there to help us write less bad code (all code is bad and buggy, anyone telling you otherwise is trying to sell you some swampland). But once they get in your way, and you start spending inordinate amounts of time dealing with typing issues, you have to ask whether or not they are helping or inhibiting.

Put another way, anything that helps should, actually help, without adding significantly to the burden. Apart from what Julia does, and other dynamic languages do ... that some people write great code in ... I've not seen/used many other systems that strive to get out of your way while working.

Typing systems do not automatically make code better. Absence of typing systems do not automatically make code worse. This is a function of the algorithm, the implementation logic, etc. How disciplined are you as the developer in making sure your code is clear, concise, and actually reflective of the problem you are attempting to solve ... things that help with that are welcome.

[1] http://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.pd...


That falls down because most static languages have "representational types", not "semantic types". Here's a post of mine on the topic:

https://news.ycombinator.com/item?id=15144364


I do occasionally get problems in Python due to typing issues, but it's pretty infrequent. Nowhere near often enough to care about all that much.


You could have static duck typing, so declarations wouldn't be an issue. Though, for the most part I like Python just the way it is. I actually hate when a library tries to check types, instead of just using whichever object I passed it.


> You can't even reliably do a "find usages" of classes using a dynamically typed language.

Let me introduce you to Racket. It started out in life as Scheme, and became something more.

DrRacket, its IDE, is fantastic. The main reason why? You can trace every function call, and every macro, to its definition with a quick mouse-over, even if said definition is in another file or library, even if said definition is only conditionally created.

There isn't really a reason a dynamic language can't do tracing like that, or mass renames across files... Because some already do.


Yeah, Common Lisp, Racket and Smalltalk all are dynamically-typed languages that offer these sorts of features. Racket can solve this through its powerful reader/macroexpander. The others do it by having you develop inside your run-time environment where you can take advantage of all sorts of interesting information only available at run-time.


mypy provides some useful type checking for Python. I have not had a chance to use it in production code because we are still stuck on Python 2, due to Google App Engine standard not supporting Python 3.


http://mypy.readthedocs.io/en/latest/python2.html

You can use comment annotations to type-check with mypy in Python 2.
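
For example (a small sketch; the typing module is installable on Python 2 from PyPI, and mypy checks the comment-style annotations when run in Python 2 mode):

    from typing import List

    def scale(values, factor):
        # type: (List[float], float) -> List[float]
        return [v * factor for v in values]

    scale([1.0, 2.0], 3.0)  # OK
    scale("abc", 3.0)       # flagged by mypy; also a TypeError at runtime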


>On the other hand, the idea that dynamic languages are more productive than static languages is laughable.

And yet it was the common understanding just a couple of decades ago. And not just from naive "script kiddies". Read what PG writes about Lisp at Viaweb, for example.

And it's not because we now have faster-to-develop static languages, type inference, faster compiles (and also not because of Haskell etc). Those apply to insignificant minorities using Go and Haskell (insignificant compared to the huge hordes using Java, C#, Python, PHP, C/C++, and JS).

It's because of the fad mostly -- JS transpilers and everybody else going for moar types.

>Statically typed languages prevent a lot of bugs and allow for a lot of automated, provably correct refactorings that simply cannot be done with a statically typed language.

Which is almost irrelevant when it comes to producing something -- you can get it to market faster with those bugs than with the extra rigidness.


From my experience, statically typed Python is quite a bit slower to write. It pays off for modules that see a lot of use in the codebase, but typing every bit of code can be a lot of work for little gain.


> that simply cannot be done with a statically typed language

You meant to write 'dynamically'. Not too late to edit!


>Statically typed languages prevent a lot of bugs

Correction:

Statically typed languages prevent a lot of the bugs that total beginners to programming would make, while being useful in preventing the one, two or three trivial, easy-to-solve bugs that a seasoned programmer would make once in a week.

They are also not useful at all in preventing the real-life, critical bugs that have nothing to do with types, which are the ones that will negatively impact the product the most and will take serious time to get fixed.

You know what static typing is good for? It has nothing to do with "preventing bugs". It has to do with performance and compiler optimizations.


> They are also not useful at all to prevent the real-life, critical bugs that have nothing to do with types at all,

Most bugs are about types; it's just that most languages' type systems are too weak to express the logical types relevant to many important bugs (and, if they weren't, then you'd probably have to worry more about bugs in the type-level code).


> Most bugs are about types; it's just that most languages' type systems are too weak to express the logical types relevant to many important bugs

Ok, so you are writing a function for a fast approximation of ATANH using IEEE floating point standard.

In type: IEEE floating point number

Out type: the same.

The function will report an incorrect value (bug), yet still be perfectly type safe.

Tell me how the magical type checking of the super-good-type-system programming language you propose will catch the error and thus prevent the bug.

Also, we're discussing real-life problems, tackled with existing programming languages, not languages that don't exist yet.

Not even Haskell, with its great type system, will help you in such cases. Nor with the bugs that are caused by a flawed understanding of the business model, the business rules, or the system as a whole.


I pretty much agree with everything in the article - except for the bit where he tries to quantify why python is better from a developer efficiency perspective than other languages.

The main example he cites is a study that compares the amount of time writing string processing routines in different languages - which is quite a bit different from the work I do every day. I develop web apps which means I generally work in very large code bases, and spend most of my time modifying existing code rather than writing fresh code from scratch. I have found that statically typed languages (java + typescript) and the fantastic IDE support that comes along with them make it really easy to navigate around the code and refactor things. Also - the compiler tends to catch and prevent a whole class of bugs that you might otherwise only catch at runtime in a dynamically typed language.

Of course there are other situations where I prefer to use Ruby as my scripting language of choice - it all comes down to using the right tool for the job at hand. Unfortunately I don't think the author gives enough consideration to the trade-offs between static vs. dynamically typed languages, and I think he would have been better just leaving that section out as it isn't really necessary to prove his point that CPU efficiency isn't important in a lot of applications.

Ultimately though I completely agree with his main point: "Optimize for your most expensive resource. That’s YOU, not the computer."


Python is also heavily used in science, where performance really does matter. It's successful because highly ergonomic Python APIs can be built on top of optimised C/C++/Fortran libraries.

That said, there is clearly a desire to write 'fast' code in python itself without swapping to C. Cython helps, but to get really fast Cython code you actually have to write with C-semantics (so you are basically writing C with Python syntax).

Projects like Numba's JIT are interesting in that they can optimise domain-specific code (i.e. numerical/array code) that's written in normal Python style. It does mean jumping through a few hoops, although with the latest version, in many cases all you need is a single decorator on your hot function. You can even do GIL-less multithreading in some cases.
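
Roughly what that looks like (a sketch, assuming a recent Numba; the decorator compiles the plain-Python loop to machine code on first call):

    import numpy as np
    from numba import njit

    @njit  # nopython-mode JIT: the loop below runs as compiled code
    def sum_of_squares(a):
        total = 0.0
        for i in range(a.shape[0]):
            total += a[i] * a[i]
        return total

    x = np.random.rand(10**6)
    print(sum_of_squares(x))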

Overall, things are looking promising, with the addition of the frame evaluation API and possible improvements to the Python C API that could make JITs and similar extensions easier.


The author argues from his professional experience as a Python developer that it's fast enough, that you'll spend most time waiting for I/O anyway, that you can just throw more servers at the problem etc.

The problem is that his experience as a Python developer doesn't accurately reflect the prevalence of problems where runtime CPU performance actually is an issue. Of course not, because who in their right mind would make an informed decision to solve such a problem in Python? Python has worked for him because it is only useless for a category of problems that he hasn't had the opportunity to solve because he's a Python developer. Outside this professional experience, not everything is a trivially parallel web service that you can just throw more servers at if CPU time exceeds I/O waiting.

It all really boils down to what your requirements are, whether you have all the time and memory of a whole server park at your hands, or a fraction of the time available in a smaller embedded system, how timely the delivery of the software has to be and how timely it needs to deliver runtime results once it's up and running. There are times where Python just isn't fast enough, or where getting it fast enough is possible, but more convoluted and tricky than implementing the solution in a more performant language. Developer time may be more expensive than the platform that my solution is for, but that doesn't get around the fact that it eventually will need to run with the available resources.


Unless we are talking circa 1999, I don't think I have heard a complaint that Python is slow. I'm curious who or where the author heard that from (not the specific people themselves, but the domain they are in).

The complaints I have heard about Python are (and I don't agree with all these points):

* Its not statically typed

* The python 2/3 compatibility

* It has some design flaws: the GIL, variable assignment, mutable variables, lambdas, indentation (I don't agree with all of these, but they are complaints I have heard).

* The plethora of packaging tools (i.e. it's not unified)

I guess one could argue it's slow because it can't do concurrency well, but that really isn't raw speed.

Then the author starts comparing programmer time on string processing from a study, which... doesn't help the author's point at all.

* Python has always been, and will always be, fast at string processing, and most people know this

* The people that complain about python speed are almost certainly not doing string processing

* I have serious questions about the study in general (many languages have changed quite a bit since then)


For some data processing tasks Python can be brutally slow, especially text processing. NumPy is only fast because it's written in C and offloads hard numerical calculations to BLAS.


Yes but at least Python has very good native access to libraries.

For example, I am sure Java is faster at numerical processing than pure Python, but at least with Python you have the option of writing native code. Python has really good C interop; some may not consider that part of the language, but I do.

Yeah, you could write native code for Java via JNI, but it's not easy (granted, it's been a while), and for reasons I can't remember, JNI is slow (probably moving stuff on and off the heap... I can't recall).


Decent, not very good. ctypes is slow, and Cython requires extra boilerplate to become performant. The Python/C API is even more annoying than that. (Compared even to something like JNI.)


> I'm curious who or where the author heard that from (not specifically the people themselves but the domain they are in).

In the telecom domain, I've dealt with data big enough that Python wasn't really feasible. Think hundreds of millions of records in CSV format that need to be parsed and processed. Doing that in Python is going to be painful.


It's interesting you mention CSV, as I had an issue with CSV in Java many years ago.

I was trying to cut a couple of columns out of a CSV and tried to write it in Java, and for some reason either the CSV library was slow or maybe I didn't have the right buffered input, but I ended up writing a Python script that outperformed my Java code (which reminds me, I meant to revisit that).

Anyway here is the python code (it might not be the exact code I used as I didn't update it but whatever):

https://gist.github.com/agentgt/1383185


Python is insanely fast at data processing and analysis because it has very fast libraries.

As a matter of fact, I don't know if you've heard, but data processing is kind of, like... Python's thing...


You're violently agreeing with each other.

Python itself can be pretty slow. Doing image processing on data stored as list-of-lists-of-integers would be brutally slow.

On the other hand, numpy is an import away, and it can be quite fast, especially if it's been built with an optimized BLAS/ATLAS, etc.


By blazingly fast you mean 100x slower than the C++ equivalent, and only 20x slower if you're very careful to avoid accidental copies.

For reference, MATLAB is about 30x slower with no special care. Pure Java on HotSpot was 5x slower, except it dies on big data inputs due to very slow GC and goes to 50x slower.

Source: handling big audio data from an HDF5 database, gigabytes in size. The C++ equivalent had no vectorization or magic BLAS or anything.


As I'll often say to these comments: then you're doing things wrong. Numpy code can be written so that it never leaves the numpy sandbox, and at that point it should be as fast as or faster than naive C++ (because you'll be getting SSE and the like for free).
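
A tiny illustration of what "staying inside the sandbox" means (illustrative only): the second version never drops back into the interpreter per element.

    import numpy as np

    x = np.random.rand(10**7)

    # Leaves the sandbox: every element round-trips through the interpreter.
    slow = sum(v * v for v in x)

    # Stays inside: the loop runs in numpy's compiled (often SIMD) kernels.
    fast = np.dot(x, x)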

There's a reason almost all deep learning is done in python.


The first time you have to return into Python, all the gains evaporate. As I said, we improved the Python code a few times and it never even got near. What it did was a bunch of convolutions, dot products, multiplications, FFTs, and gamma probability equations. Even vector comparisons used numpy. It used the direct HDF5 interface numpy has and the fast view interface for overlaps.

Numba choked on the code by the way and crashed unless views were removed. And then it produced much worse performance.


Not all data is a good fit for Numpy: some data is non-numeric or not a homogenous array.

> There's a reason almost all deep learning is done in python.

The heavy-lifting in e.g. TensorFlow is done in C++. Bindings to Python make sense because it is one of the few sanctioned languages inside Google, and it is widely used outside of Google and easy to pick up.


>The heavy-lifting in e.g. TensorFlow is done in C++. Bindings to Python make sense because it is one of the few sanctioned languages inside Google, and it is widely used outside of Google and easy to pick up.

That's exactly the same as with numpy. I'm not sure what your point is. C++ is also one of the few sanctioned languages inside google, as is Java.

>Not all data is a good fit for Numpy: some data is non-numeric or not a homogenous array.

I'm curious what kind of data you're working with that can't be represented and effectively transformed in a tensor (numpy array).


> That's exactly the same as with numpy. I'm not sure what your point is.

I was replying to "there's a reason why...". You didn't specify that reason, so from the rest of your comment I took it to mean that Python (with numpy) was fast and good enough to write deep learning stuff. That doesn't seem to be the case for TensorFlow.

> I'm curious what kind of data you're working with that can't be represented and effectively transformed in a tensor (numpy array).

I'm not intimately familiar with the internals of numpy, but my understanding is that the basic data structure is a (multi-dimensional) array of values (not pointers). That leads to a number of questions.

If you have an array of records (dtype objects), and one of the fields is a string, am I correct that each element needs to allocate memory to hold the longest possible value that can occur for that field? What if that is not known beforehand?

How do you deal with optional fields (e.g. int or null)? Do you need to add a separate boolean to indicate null?

How do you deal with union types, e.g. each record can be one of x types, do you make a record that has a field for each of the fields of those x types? Do those fields take up space?


>You didn't specify that reason, so from the rest of your comment I took it to mean that Python (with numpy) was fast and good enough to write deep learning stuff. That doesn't seem to be the case for TensorFlow.

Tensorflow tensors are numpy arrays, or are transparently viewable as such.

>If you have an array of records (dtype objects), and one of the fields is a string, am I correct that each element needs to allocate memory to hold the longest possible value that can occur for that field? What if that is not known beforehand?

Yes, although you can also store numpy arrays of pyobjects, which are arrays of pointers. You'll be able to vectorize the code, but you won't get the same performance improvements as with a normal numpy array, because that same level of performance isn't possible with an array of pointers.

Note that for most machine learning applications, you'd preprocess your string into a vector of some kind.

>How do you deal with optional fields (e.g. int or null)? Do you need to add a separate boolean to indicate null?

Yes, but I'm not sure when you'd do that. That is, again in most machine learning applications you'd be representing things as one-hot arrays or as some kind of compressed high dimensional position vector, where 0 would represent a lack of presence of some thing.

>How do you deal with union types

dt = np.dtype((np.int32,{'real':(np.int16, 0),'imag':(np.int16, 2)}))

is a 32 bit int that can also be accessed as a 16 bit complex number via .real and .imag.


> Python is insanely fast at data processing and analysis because it has very fast libraries.

It doesn't have fast libraries for everything. E.g. for the use case I brought up, it doesn't (or at least none that I could find at the time). Not all data are homogenous arrays that fit in numpy.

And are those libraries written in Python? Or in C because Python is too slow?

If you take your reasoning, any language that can bind to C (which is pretty much any language) is as fast as C. That is not very helpful when comparing the speed of languages. A slower language will force you to drop into C sooner than a faster one.

> As a matter of fact, don't know if you've heard, but data processing it kind of like.. Python's thing...

Things like audio and video codecs, or crypto code aren't written in Python.


I'd agree with all the complaints you list, and I love Python, but it is definitely kinda slow. I'd put the speed of python below several of the items on your list in terms of priorities I care about.

I've done a lot of image processing in Python using libraries like PIL, numpy and OpenCV. Doing per-pixel operations is extremely easy in PIL, which improves my dev time, but CPU-wise it is the slowest of the bunch. I love prototyping that way, but I have to move to numpy or OpenCV or another language to speed it up. A recent program to do a slightly complicated color transform on a 512x512 image was taking me over 60 seconds with PIL. It was 5-10 seconds in JavaScript, and less than 1 second using numpy.


Wouldn't numpy be the default choice for this type of work in Python?


Yes, if you care a lot about performance. But there's a cognitive cost to numpy for me because I'm not completely fluent. It's much more difficult to prototype something, so I usually wait until I've iterated on a prototype and I know exactly what I want to do before I dive into numpy.

Numpy also has an inside-out order of operations compared to vanilla Python; the reason it's fast is that you put the inside of the loop on the outside of the code. That can make it really hard to iterate on structural changes in your code. You can lose a lot of the dev-time advantages by going numpy first.

Also a lot of people don't consider numpy to be a fair comparison when talking about Python's language performance. Because it's optimized compiled code underneath the Python binding, it's not representative of the speed of the Python interpreter.


Python is very developer-productivity friendly, but it degrades performance in weird ways, and the methods to increase performance don't always make a lot of obvious sense and often aren't the "idiomatic" way.

I haven't looked into the guts myself, but I'd bet that for similar or equivalent operations, the way the different syntaxes are handled under the hood is wildly different, e.g. function calls are much faster than method calls on objects.
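
For what it's worth, this kind of claim is easy to measure yourself with timeit (a quick, illustrative micro-benchmark; absolute numbers vary a lot by interpreter version):

    import timeit

    def func(x):
        return x + 1

    class C(object):
        def method(self, x):
            return x + 1

    obj = C()
    print(timeit.timeit(lambda: func(1)))        # plain function call
    print(timeit.timeit(lambda: obj.method(1)))  # attribute lookup + bound-method call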

Optimized pure python can look very ugly and non-idiomatic.


Most comparisons of popular web development languages will show a table where every other language outperforms Python (except maybe Ruby) [1]. In reality, as the author pointed out, the benchmarks don't really matter once you consider network calls.

[1] https://www.techempower.com/benchmarks/


Author here. I mostly work with a lot of Java devs. They are hesitant to try Python because "Java is faster". Some use the static typing excuse, but oddly, the most common reason I hear people won't do Python is that it's "too slow". Perhaps I live in a weird bubble, but that's the motivation behind the article.


I'm a Java dev (and the one you replied to), and in the past, when I didn't work at my own company, the real reason Python was often not picked was that it was considered "a toy language" or "scripting language" (i.e. hard to maintain), and also because it didn't have vendor support like Java did. Of course, this was like 8-10 years ago.

I still program and sadly prefer writing code in Java over Python even though I have tons of Python scripts in my ~/bin that I have written over the years. Mainly because I just know the ecosphere better.

And that's what most people should just admit about why they choose a particular programming language. As long as it's good enough... "better" is just what you know more of... the cost of learning something new just to replace something for a marginal speed increase isn't worth it.

So when people say Java is slow to learn, not cool, or slow to develop in because the language is verbose... I say, just like you, "I don't care", because learning the whole ecosphere of another language is goddamn expensive (that is, languages are relatively easy to learn, but all the libraries, best practices and tools are another thing entirely).


That's a very fair argument. But let's say, hypothetically, that language X is more "productive" than language Y. Wouldn't it be a worthwhile "investment" to learn X seriously? Sure, it might slow you down for a year, but after that it pays dividends.


Yes of course and for my own personal projects I will take the time often to learn new languages (e.g. Rust).

However, when dealing with a team, often with an existing code base, getting everyone on board with "X" while potentially porting to "X" is fairly expensive. It's generally only worthwhile because of actual technical limitations and not productivity. Even the promise of safety (ie prevention of bugs) I would probably rank higher than productivity (although I suppose that is in a way productivity).

Regardless productivity is really hard to measure and especially predict unlike other technical things.


But if it's all unjustified hype, unmentioned drawbacks, and empty promises, then your losses can be extraordinary.


Oh, I've heard "Python is slow" from Node folks, Java folks, Scala folks (in terms of productivity) and of course, die hard C++ fans. I'm a fan of Python but i also like languages with a stronger FP bent (and statically typed.)


> Oh, I've heard "Python is slow" from Node folks, Java folks, Scala folks (in terms of productivity)

I have a hard time believing Java folks made an argument on productivity, unless you meant for that parenthetical statement to apply only to Scala. Regardless, productivity is far more complicated than raw speed (particularly if you get into maintenance, as some languages make it easy to pump out code that's harder to maintain... oh perl.. the pain...).

And yeah I have seen the C++ folks complain about Python but this in my experience has been video game programming which is clearly not the domain the author is in or talking about (I think given the microservice discussions). Hence why I would like to know more about these folks complaining.


Could you elaborate on what are the complaints you've heard about "variable assigning, mutable variables, lambdas"? Just curious.


It's unclear whether you are introducing a new variable to the scope or mutating an existing one.

    # Am I defining foo for the first time, or am I mutating an existing one?
    foo = "stuff"
Python, if I recall, combats this in the OOP sense by requiring self.foo, but this is not the case with other forms of lexical scoping, such as loops within loops etc.
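
A tiny (hypothetical) illustration of what I mean: a bare assignment in a nested scope silently creates a new local unless you say otherwise with Python 3's nonlocal.

    def counter():
        count = 0

        def bump_broken():
            count = count + 1   # UnboundLocalError if called: this assignment
                                # makes `count` a brand-new local

        def bump():
            nonlocal count      # explicitly mutate the enclosing `count`
            count += 1

        bump()
        return count            # -> 1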

And of course you can't define constants.

IIRC for lambdas it's that they can't span multiple lines, and I believe they are getting removed (I'm not a python 3 expert)?


> * Its not statically typed

Yes, that's why we use it.


"It doesn't matter than Python is slow, besides we can use compiled libraries to speed it up"

"People saying it doesn't matter that Python is slow are deluding themselves and preventing Python from getting faster like JS did"

"Python is inherently harder to optimize than JS since it has <very dynamic features>"

"Smalltalk/Lisp/etc are also very dynamic yet are much faster"

"The slowness of Python is harming the planet by being inefficient and therefore wasting more energy/producing more pollution"

Did I miss any arguments? I know certain topics are bound to attract some repetitive discussion, but "Python is slow" has been one of the worst.


"Because Guido said so" isn't on your list, but otherwise I think you got them all


> "Python is inherently harder to optimize than JS since it has <very dynamic features>"

Python is not a very dynamic language in the sense that you actually can't change a lot of stuff (and a number of the things you can change just segfault CPython). I think JS is more dynamic, for example. Or Ruby.


These are not my arguments, mind you; I don't know enough to make them.

You've piqued my interest, though: can you give me an example of those things that you can't change or that break CPython?


Things like the Carlo Verre hack (also a thing you can't change —any more— in Python: builtins), editing objects during their construction (via e.g. gc)... generally, the gc module allows other ways as well to crash your interpreter.

    >>> import gc
    >>> 'foo'.lower()
    >>> gc.get_referents(str.__dict__)[0]['lower'] = str.upper
    >>> 'foo'.lower()
    segmentation fault (core dumped)  python
(That's the method lookup cache)

A talk in this direction is https://www.youtube.com/watch?v=qCGofLIzX6g


"The slowness isn't worth the metaclass abuse" would be my take


Python is my Swiss army knife. I love it because it is a single tool that can aid in almost every project I do. But if I'm doing one specific thing a lot, I want that thing to be done well and done efficiently, so I'll reach for the specific screwdriver I need.

Also most of my problems are IO bound so single threaded concurrency is fine.

But I represent a very small portion of the global problem space.


The fact that Python is slow isn't its only problem. What I care more about nowadays is wasting my time hunting bugs that could have been avoided by a static type system.


I see this stated as if it's a universal fact, but is it really true that static types reduce overall bug density?

I myself have found this to not be true, and I have found the same in reading / talking to others. However, if you have found a resource that has data on the contrary, I would find it very interesting to read.

https://medium.com/javascript-scene/the-shocking-secret-abou...


A static type system doesn't protect you from choosing the wrong abstraction, and all the bugs that result from it. But if used properly I think it does help.


How? I don't think I've ever gotten a type I wasn't expecting in any of my python code. It's just not a problem for me.

That is, past the first time I run it. I've been known to pass something an x when it wanted a y, but that's a "fail early, fail loud" bug and not an actual problem.


Unfortunately, in my experience I don't know what type of x I'm supposed to pass in. I either have to guess what x is out of an infinite search space, or I have to look at the implementation; but it also calls some other functions, so I need to determine the possible inputs for those functions too, and repeat this until I've read hundreds of lines of code just to use a single function, because someone thought documentation was unnecessary and 10 seconds of less typing was worth the tradeoff. Static typing is statically enforced documentation. Sure, you might be lucky and already have voluntary documentation available, but I am not. I'm getting sick of randomly being stuck for 10 hours on some trivial problem caused by the wrong incentives.


This just means that you already have an idea of what kind of object should go where, you just never wrote it down. And because you never wrote it down it becomes hard to generalise things when you eventually want to reuse some method, since you never made explicit which operations an object should support to be able to reuse that method.

Also you have no guarantee that what you think the types should be is actually consistent.


Well the obvious problem is that you have to test all of the code paths before deciding that your code is type safe. Static types do not have that problem.


Have you checked out MyPy and the optional type hinting? I've been going that route in my python recently and been liking the both/and of having an optional type checker.
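
A minimal sketch of what that workflow looks like (assuming mypy is installed and you run `mypy example.py` before running the code):

    def greet(name: str, times: int) -> str:
        return ", ".join(["hello " + name] * times)

    greet("world", 3)     # fine
    greet("world", "3")   # mypy flags this before you ever run it: str where int is expected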


I have, and I do like that Python is heading in this direction. But this feature does still feel half-baked to me, the error messages are often confusing and I do recall running into a few bugs along the way. I'm also not certain that the type hinting will ever be 100%, the dynamic nature of Python seems to prevent that. This could end up giving some people a false sense of correctness.


The _getting started_ part could certainly be a bit smoother, I have had a few issues as well. Generally speaking I think for in-house things you can probably alleviate a bit of the possible dynamic issues that you mention - just by discouraging too much meta-ness.


You only pay the price while debugging. In your static language you pay the price continually. You are wasting orders of magnitude more time fighting your language's type system every day, you have to read reams of boilerplate code that are unrelated to the problem at hand, and each extra line increases the attack surface and complexity of your code.


>You are wasting orders of magnitude more time fighting your language's type system every day

I program C# a lot, I never fight the type system. It helps me. It tells me while coding that something is wrong. In Python I'll catch this as well...eventually, when I run the code. A type error in C# would be a runtime error in Python. That's very inconvenient.

The static types are also documentation showing intent. It's hard to read old dynamically typed code bases, because I have to do the type processing in my head to understand what is what. Much simpler if I have nicely defined interfaces.


When I write a new function in Python I immediately evaluate it in a running REPL and test it. There's no eventually about it. A function either works or it doesn't.


Having worked in both dynamic and statically typed systems, I don't think this is true. A good statically typed language doesn't get in your way very often, and it doesn't produce any more boilerplate than doc comments do in dynamic languages. You do write doc comments, right?

In fact, I'd say that most of the time, a good statically typed language does the opposite of get in your way. It helps you w/ refactoring tools, better intellisense and editor help, self-documentation, etc. The older a codebase gets, the more helpful a static type system is.

That said, there are statically typed languages which produce a ton of boilerplate (Java used to, at least), and there are those that don't (F# comes to mind).

The only dynamically typed language I've used where I didn't regularly miss static typing is Clojure. (Elixir may fit the bill, but the jury's still out on that.)


This.

Let me add a radical point: static typing is often thrown out anyway, when you encode your data as strings, serialize/marshal it, create polymorphic lists and hierarchical data structures, etc etc

Meanwhile, disciplined use of python is perfectly easy to get right:

- use pylint

- be ruthlessly consistent in naming classes, methods and variables (which pylint helps with, btw)

- for large projects, consider mypy etc


If you're working in a statically typed language and cast everything to string / object you deserve all the bugs you get.


Mypy / the PyCharm typechecker have come a long way. It's not the same, of course, but you can get a good bit of mileage out of the new gradual typing system these days.


Please tell me, do you write tests? I've learned that it's necessary. I do 100% code coverage and I'm enjoying TDD a lot.


The whole point of static type systems is that they give you the "type tests". 100% code coverage just to get what the compiler would give you is a waste of time. If python is supposed make developers more productive, this is just dragging them down.


If you haven't already tried, I recommend to port [1] a Python program to OCaml, F# or Haskell.

There are so many tests you can throw away, so many corner cases you don't need to test anymore. (I'm talking about tests that you can't even write down without getting a compiler error.) Not to mention the assertions within your functions that aren't needed anymore.

And those little type annotations are so much simpler and easier to write down than corresponding tests.

If you really head for 100% testing, not just code coverage, but also all corner cases, you will love the modern static type systems. (However, you should really use an ML type system, because doing the same with Java or C++ is cumbersome and not much fun.)

[1] I did so for my mathematics diploma thesis, starting implementing my formalism in Python, having lots of trouble when refactoring, then moving all code to OCaml and then being able to refactor the code alongside the developing formalism in the thesis.


>I'm enjoying TDD a lot

Meh, tests are great insofar as they can help you not break things if you upgrade the framework or language in the future. But other than that I have to agree with Kent Beck's sentiment of "I get paid for code that works, not for tests".

https://istacee.wordpress.com/2013/09/18/kent-beck-i-get-pai...

If I had started out trying to learn Django using TDD as in this tutorial:

http://chimera.labs.oreilly.com/books/1234000000754/index.ht...

I never would have learned Django. I would have given up with the ridiculous slow progress of results and functionality.


Time/cost to locate and fix a bug: at compile time < in unit test < in integration test < in production.

Besides, tests cannot find all the bugs a good static type system can find, and vice versa. Tests, even at 100% coverage, are not a complete substitute.


100% coverage is meaningless with respect to correctness and the ability to prove facts about your code. 100% coverage only means that you've been able to exercise all relevant lines of code at least once, not that you've tested all possible execution paths through the code.

The idea that you could somehow replace the utility of static type checking with a suite of human written tests is laughable. Static typing systems allow you to prove facts about your code which make reasoning about correctness much easier than with dynamic type systems which practically disallow this (unless gradual typing is allowed in your language).


Static types and tests check different things. Tests are for making things function as desired. Types are for making sure they actually connect sensibly where desired. Sometimes you can use each for the other purpose, though that often produces ugly tests/types.


Python's value to me has always been that it's easier to get things done, not its speed. One time when I was interviewing a candidate for a coding job, the candidate said she loved Python the most "because you can just yell at it and it'll work."

It's both the breadth of the standard library and ecosystem, and the simple language design, that make developing things in Python faster for me.

Doing problems on Project Euler has been an education for me in how algorithm matters more than speed. Lots and lots of people spend hours writing long C++ programs that are easily beaten by a few lines of Python. It certainly goes the other way too, and the wrong algorithm in Python is even that much slower and more painful than the right algorithm in C++. But when the right algorithm is used and the problem is solved in a few milliseconds, it really doesn't matter which language uses more CPU cycles, all that matters is whether you saw the insight that let you skip 99% of the search space, and how much time you spend writing code.


Somewhat ironically, Python is used a lot for things that would benefit from raw speed (data processing pipelines) and do not benefit at all from dynamic typing (since the kind of property bags / data frame views over data are easily replicated in statically typed languages). But Python's C extension API is quite a bit easier than e.g. Matlab's MEX API (to me at least); can typical Python IDEs compile and relink extension modules without an external build step?

> Your bottleneck is most likely not CPU or Python itself.

With applications that are dominated by raw data processing, it's very, very easy to be CPU dominated. Hell, I had one quite trivial data converter for logfiles where the "parsing the printf string" part of Java's printf dominated processing and writing a custom formatter halved processing time (while regexes can be compiled, the format string cannot be precompiled and will be interpreted each time); it's one of those things where I would intuitively say "why did this moron write his custom formatter" if I stumbled upon it in a code review. Intuitively, you'd expect this to be a simple case of an IO dominated task (which it is now once the bottleneck has been removed).

If it's fire-and-forget batch jobs, you can get away with it, but if the converter is part of a user-facing fat client application that runs on an old office laptop, you don't have that luxury.


The article could be titled: "Yes, Python is Slow To Refactor and Maintain, and I Still Don't Care".

I never understand why dynamic language enthusiasts primarily focus on new code only. You have to discuss all sides of increased or decreased productivity to make a rational argument.


Python is optimized for getting interesting things done in a few lines of code. Small scripts you write once and then forget.

For serious projects? IMO python is a disaster.


> Your bottleneck is most likely not CPU or Python itself.

I've found that this is often the case. Nearly always disk or network. But it's sometimes surprising how little work you need to do to become CPU-bound. This is the price we pay for such a tremendously dynamic language.

Indeed, the article's suggestions of C/Cython/PyPy are good ones to remedy the problem when it occurs.


I get the point this guy is making, but if you need something parallel for a cpu bound task, throwing more hardware at the problem isn't the most efficient solution if you can just use more cores. For example adding another quad core when the first cpu is only using one core anyway is inefficient and expensive.

Right tool for the right job I suppose.


Python does multiprocessing very well. You can easily use all cores on your machine. Python's main "disadvantage" is threading, because of the GIL. But each process gets its own GIL, so when you multiprocess, you're not limited to one core.
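
A minimal sketch (toy CPU-bound function, not real code): a process pool spreads the work across all cores, each worker process with its own interpreter and GIL.

    from multiprocessing import Pool

    def crunch(n):
        # Toy CPU-bound work.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        with Pool() as pool:                      # defaults to os.cpu_count() workers
            results = pool.map(crunch, [10**6] * 16)
        print(len(results))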


This. I had a problem where I needed to scrape roughly 20,000 html documents daily, which is normally a pretty slow task. You have to open the file, load it into memory, parse the DOM, and then run all of your selection methods. Sequentially, it took about 60 minutes daily. Multithreading slowed it down because it was CPU bound. Multiprocessing allowed me to run 12 processes across 8 cores. That took the total processing time down to about 4 minutes or so. And I was able to write the code in a day. Writing something similar in Java or C++ would have taken me a week.


The point is that (modifying existing code or writing new code to) "just use more cores" may be less efficient, for the business or organization that is employing the programmer, given that programmer salaries, over even a fairly short amount of time, can be more expensive than hardware.


"It used to be the case that programs took a really long time to run. CPU’s were expensive, memory was expensive. Running time of a program used to be an important metric."

As hardware gets faster we give it new tasks that could not be achieved before. Like rendering high resolution stereoscopic images using physically based shading at 90 FPS on relatively cheap consumer hardware (VR). There is still quite a lot of code that we call 'performance critical'. Most of that code is written in C/C++ (and CUDA and glsl, and hlsl, etc...) today.


Yeah, also we have a lot of code that a lot of engineers in our company run. Some of it takes hours instead of a few minutes to run, because it's a bunch of slowness combined. Sure you can say the engineer doesn't have to watch. But sometimes you need this result, and can't do much else until it's there. Doing a few iterations at a few hours each is very time consuming.

Being lazy about slowness is bad in the long run, if you or your customers waste inordinate amounts of time, because one of your devs didn't want to think.

Sure you should optimize from the bottleneck, look at hot path, measure where optimization makes sense. But using tools that make some stuff 10x faster without much effort is not premature optimization. It's just not being stupid.


It's still expensive on client machines because most of the persons in the world are NOT software engineers with 6-digit salaries.

They run cheap computers with HDDs and Windows polluted by a ton of 3rd party crap. They don't know how to fix it and silently suffer.

I was cleaning a local vet clinic's devices recently – they were literally switching between two computers to avoid waiting through 5 minutes of non-responsiveness, because some bloated software was occasionally consuming 100% of the CPU.


A lot of businesses these days prefer web apps. It's not hard to understand why - all the hassle of system maintenance falls to the people who host the app and can afford to know their stuff. If your Windows PC is suffering from rot just replace it with a Chromebook.


"Just get money for a new device out of thin air and just replace all your paid or even cracked Windows software with subscription-based alternatives that will not work without Internet. Ah, also just relearn all your workflows."

Sorry, but that's what being in a bubble looks like.


I wasn't suggesting that businesses were anxious to replace things that already work, I'm suggesting that as they acquire new software it's more likely to be web-based.

Devices are often replaced on a schedule anyway, especially if they're leased.


I wasn't talking only about businesses in my previous reply. But most businesses on the planet aren't bathing in money either.

You are speaking as a citizen of a rich country where devices are relatively cheap and stable Internet is available everywhere.


The problem is not so much that python is slow. It's that in some scenarios python can't be made fast.

Fast prototyping is great but being stuck with a prototype for deployment isn't.


In ten years of Python development I have yet to come across an instance where Python couldn't be made fast. In some cases critical sections had to be delegated to C but even that is very rare.


>It's that in some scenarios python can't be made fast.

Can you give some examples of this? I mean, obviously with enough effort you can "make python fast" since it has good C bindings, and can just be a thin wrapper around fast stuff. Similar to how command line tools can be ridiculously fast[^1] despite, ostensibly, running in bash.

So I'm a bit confused about what you're claiming. Organizational issues, it's difficult to get management on board with an optimization pass?

[^1]: https://aadrake.com/command-line-tools-can-be-235x-faster-th...


My point is do you have competent C programmers on your team?

That said, there is a point in some python prototypes where you "hit the performance wall". For whatever reason, you'll need to look at one of the options to make python faster and none of them are painless unless you're already a serious C programmer.


Did you read the section on "Optimizing Python"? If so, I'm assuming you read about using Cython to migrate your package/program/module painlessly to C.

So given that, can you elaborate on your objection? I'd like to know why you think the article is wrong about optimising your Python to make it fast enough.


"migrate your package/program/module painlessly to C"

Can you or anyone on your Python-using team use C professionally for things you'll put in production?

I work with python every day on a team that uses python every day, and I worked professionally in C++ for over 2 years, and I still wouldn't trust myself to write a reliable/safe C backend to something.

And using C++ behind python with something like pybind11 is anything but painless on the other end; it requires careful consideration.

Python is "good enough" in nearly all cases, but there are some instances where you sort of hate yourself for using it for the prototype since you're forced to effectively rewrite afterwards.


You should try python-cffi. Using python code as a shared library under C is pretty easy, and calling C from python is pretty easy. The documentation is a bit crappy, but it's alright once you get the hang of it.

I recently used it to prototype a shared library that overrides the `getenv` libc call, with the intent of providing decrypt-on-use environment variables. It's pretty simple, stolen from lua I believe.

https://cffi.readthedocs.io/en/latest/overview.html#embeddin...
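
For the calling-C-from-Python direction, cffi's ABI mode is about as small as it gets; a minimal sketch, using libc's strlen as a stand-in so no compile step is needed:

    from cffi import FFI

    ffi = FFI()
    ffi.cdef("size_t strlen(const char *s);")   # declare the C signature we want
    C = ffi.dlopen(None)                        # load the standard C library (POSIX)
    print(C.strlen(b"hello"))                   # -> 5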


One of the best features of Python is the almost free FFI into C. This means you can prototype and then dial up the performance to 11 if required.


There are still some big gains python could make, if python implementations were better.

Micropython is equivalent to a real-time cooperative-multitasking OS. If it had ~~better~~ support for things like cffi, you could implement posix on top of it. I can imagine a laptop that runs gnu+python in the next few years.

That's a whole new usecase, simply because that implementation uses a lot less ram. What usecases would we discover for a faster python?

Shared objects and proper sandboxing would also be huge.


"There's are still some big gains python could make, if python implementations were better."

At this point, I would find it far easier to believe that you are underestimating the difficulty involved in what it takes to speed up Python than that there are enormous gains yet to be had in speeding up Python. I suspect JS has had more optimization effort expended overall, but Python has still had a ton of work by lots of smart people, and generally got an earlier head start on optimization. (They didn't start trying to make JS "fast" right away; they spent rather a lot of time getting JS's hookup to the DOM in the face of things like .innerHTML working first, before anyone even cared to do what we today do routinely without thought in plain ol' Javascript, let alone with our glorious frameworks.)

There are enormous gains to be had in speeding up "a language that is like Python except certain things are banned", but people have already done that analysis too and discovered that broadly speaking, if you do that, too much existing Python breaks. If you want to see something like that, check out the RPython aspect of the PyPy project, which successfully implements a fast subset of Python. But it is a noticeably restricted subset of Python; AIUI it's not even close to something you can just drop in to your code and get faster speeds.

One of the things that I've learned from Python and the other attempts to speed up the scripting languages is that despite the mantra, yes, there is in practice such a thing as an intrinsically slow language. (The theoretical existence of a Python interpreter that can run all existing Python code at C speeds doesn't do us much good if we have no idea after decades of very smart people banging on the problem how to manifest it. Personally I'd suspect that while such a beast theoretically exists it has an exponential complexity startup cost or, if you prefer, exponential compilation costs. And probably a pretty decent code and/or RAM bloat cost, too.) And Python is one of them. Some of the reasons why it is so much fun to use are part of that intrinsic slowness. Some of them really aren't.

I personally think there's a lot of up-and-coming languages that are exploring the space of how to get those nicer programming abstractions and programmer-convenient code without paying anywhere near the runtime cost that the dynamic scripting languages of today do; it's one of the more exciting developments I see coming up. People complain a lot about code bloat and poor performance of our code since right now we have to choose between "fairly inconvenient but fast" and "convenient but slow and bloated". Patience! Better choices are developing, but they're still young.


I consider "python implementations being better" to be some simple things.

* pypy-level performance by default

* Much faster startup times (I'm looking at you, namedtuple)

* Better libraries for specific use-cases, like interacting with opengl

* Rpython actually being documented so I can write library code in it

I'd also really like to see good sandboxing and fast-enough object sharing.

Those are the kinds of things I think can open up whole new market-segments. And python being used by more market-segments means more support for python.


I think the work being done on ruby with the truffle/graal project is interesting in this regard - it's my understanding that ruby is very dynamic too - and it's interesting to see what can be achieved by inlining C into Ruby into C, etc. It's a crazy stack, compared to something like red (redlang) - but it's amazing how close they are able to get to "magically fast compiler/runtime".

If they manage to combine tracing/warm-up with ahead-of-time compilation to executables... It'll be very interesting indeed.


Will you please share the languages you think are up and coming?


Go is an early entrant into this space, but I think part of the reason it is early is also that it is less ambitious. But to answer the ever-present question on HN about "why would anyone ever use this language?", something modern, almost as easy to use as a scripting language [1], and almost as fast as a compiled language, doesn't actually have a lot of contenders. (Old fogeys... like me!... like to observe that if you drop modern you have some things like Delphi that fit that slot, but they're all pretty much dead now, and Go has good support for concurrency in the modern processor environment.)

In the "you probably can't convince your boss yet" category I'd recommend Crystal (https://crystal-lang.org/) and Nim (https://nim-lang.org/).

Given the programming landscape and the general direction of things lately, I also bet there's a couple of serious contenders developing out there that haven't even hit HN yet.

[1]: For at least a broad class of problems. Put Go head-to-head with a problem someone would use NumPy for and Go will go down in flames in the ease-of-use and line count department. However I use Go for a lot of networks servers (not even necessarily Web servers, but network servers) and the line count for these comes out maybe 20% larger than Python, and it doesn't take much developer cognitive energy for those extra lines. I've also used Go for some command-line type apps where the line count is probably 50% over Python, but I also got some significant wins from the type system and concurrency, so, all in all there's a lot of things I can prototype with about the same mental effort in Go as I could in Python. Being able to declare interfaces that existing types conform to turns out to cover a surprising amount of those "duck-type" scripting-type cases.


This keeps getting posted, and while it makes some valid points, it's a lot of handwaving.

Arguably, other languages can get code out faster depending on the dev, language, etc.


Agreed. Things that are handwaved include:

1) Performance can be a genuine requirement of the product, i.e. if it's not fast enough, it doesn't ship. You can't ship faster and cheaper by sacrificing the thing you need to ship (well, you can, but then you're shipping a different product, not meeting the same requirements sooner; it's no different than cutting a feature).

2) Many processes can't be horizontally scaled in an efficient way, period. Not because the programmer is ignorant of some cool algorithm, but because the problem is fundamentally expensive to parallelize. Maybe you end up getting something like a 20% boost by having twice as many nodes, even after applying all the cool algorithms. And you don't necessarily get that scalability in your code base for free, either.

3) "Speed" in the mobile and embedded spaces is often as much about energy efficiency and thermal management as getting done sooner.

4) The metrics for deciding that Python is faster to develop in only measure small problems. People tend to shy away from Python for bigger projects, and the reasons for this are pretty hotly debated.


Definitely depends on the developer. That's why he put the disclaimer at the beginning of the article: "I'm a Python fan boy". Nevertheless, a very good read to me.


Many times when Python is blamed for being slow, it's the programmer's fault. Python is great in that you can get 'regular' people writing code in it quickly. The problem is, these regular people don't always understand algorithms or things like caches, threads, databases...

A lot of these users can just say "My department needs a $40,000 24 CPU server with maximum RAM from MicroWay/SuperMicro, we need to run our codes faster", when they are just trying to brute force things.

They understand the problem domain but don't have the programming skills to use a computer to efficiently solve it.

But, these guys are all a step ahead of the ones who are stuck in the mindset of "C is the only language fast enough for my work", while not even understanding pointers and basic syntax and getting stuck on silly things like text processing, which could be done in minutes in Python.


Author here. Surprised to see this topping HN. Appreciate all the feedback. Let me know if you have any questions.


Yes, time to market is important. However, you don't need to compromise convenience of development for the sake of performance. If you twist your Python code to get performance it takes time. If you need performance, and like the syntax of Python then you should take a look at Nim [1]. With Nim I develop as quickly as in Python while I get the performance of C.

[1] https://nim-lang.org

I believe application performance is important on servers. It makes a difference if your Shop software written in Python is able to handle 50 requests per second, or if the same software written in Nim can handle 500 rps. And by the way, Nim provides static typing which helps a lot to catch errors at compile time.


Nim seems really good, but does it have a decent REPL these days? I'm not sure if it would be as convenient with a statically typed language, but I like the incremental development approach so much that I only use C if I absolutely have to.


It doesn't. But you can grab Aporia (or some other tool) to quickly compile and run some code, it replaces a REPL very well in my experience.


Premise: It's more important to be productive than to have fast code. Conclusion: Use Python. Is the premise true? For many cases, yes, but it depends. If you are running an application on the cloud and your metric is $/user/year and you have many users then saving some compute resources for each user gets attractive and you don't want to just throw another VM at it.

Is the conclusion true? Garbage collection gives big productivity gains. Other languages have GC. It's not nice to see your Python code die after a few days because you messed up the type passed to a function. Other languages fix that at compile time. Multicore is now. Other languages are built with better multicore awareness.


> without getting stuck in the weeds of the small things such as whether you should use a vector or an array

Yes, instead get into the weeds of tuple vs list

Not included in the graph of time-to-solve-problem static languages: statically typed languages with type inference


Given that they have exactly the same interface, that choice is really easy. You go with one until it turns out to be insufficient, and then you switch to the other and not a single line of code has to change, except at the point where you create the thing.

Incidentally, the same is true in many situations in Python, and that is (IMO) one of its strengths.


> However, this is no longer true, as silicon is now cheap. Like really cheap. Run time is no longer your most expensive resource.

Our client won't spend more money than a t2.medium instance on aws. Nothing we can do about it. In that case, run time does become an expensive resource.

But I get the point that OP is trying to make. Just wanted to mention that not all of us have the comfort of having enough resources on which our app runs.


> It’s more important to get stuff done than to make it go fast.

This is not a real absolute. It is only valid when what you have to run will not benefit a lot from performance or suffer a lot from lack of it.

The real guidance you can have in these matters is: how many times is my code going to run per second?

Some programs are written to be run once a day, others 10000000 times in a second. The first ones should be written in the language you're most productive in, the second ones in the fastest possible language.


Putting aside the discussion regarding productivity, there is a case where I have found execution time to matter. Scaling an application which uses an unsharded database. The long transaction durations and number of connections were bottlenecking db throughput. The particular app was a Ruby/Rails monolith.


This sentiment is the reason for almost all software (especially on the web) being a load of crap. It's slow, it's buggy and developers always give the same excuse: CPUs and memory are cheap, therefore we can waste our customers' time.

Imagine what we could do with the amazing hardware we have, if people started to do the sane thing and actually use the hardware to do things efficiently.


Giving EvE Online as an example is bad because that game artificially slows the game loop to keep up with the number of players. Would this happen with C++ on a recent architecture? Probably not.


In "What if CPU time is an issue?" we could also mention the nim language (and not only cython) because it compiles (not only) to C and feels like python.


I know it's not trendy, but I would argue PHP is as productive for developers as Python and has a MUCH faster runtime, particularly after 7.


Slow doesn't matter when you scale horizontally.


Yeah? Well, Jimmy Crack Corn and I don't care.



