Okay, I remember this talk, I'll try to write pseudo code for one of the real life bugs that nobody saw for 20 years (in Bugzilla that is, not some small script nobody knows).
$user = new User({
userid => $foo,
role => "user",
username => cgi->param("username"),
password => bcrypt(cgi->param("password"))
});
If you're seriously telling me you instantly spot the bug (if that were real perl, i don't know perl) without having seen the talk before, then congratulations, nobody else did, please start auditing perl software.
For anyone that didn't see the talk, no the bug isn't SQL injection or something like that (though that'd likely be possible too, watch the talk).
If you pass ?username=foo&username=bar then cgi.param("username") will actually return a list ("foo", "bar").
Now say you pass a list of three elements ?username=foo&username=role&username=admin ... yes seriously. It'll add the list into the hash and use every %2 element as key and the other as value. allowing you to overwrite stuff in hashes.
Of course, all of this is documented. But that doesn't make it not terrible.
PS: I don't hate perl. Perl -e is still very handy sometimes.
I'll bite... I have been programming in Perl for the bioinformatics community for about seven years.
I understand your code is pseudocode, but I should note that the "." is considered string concatenation in Perl. My very first thought before reading too carefully was that the bug was that someone tried to concatenate strings instead of calling functions. :) But I kept in mind that this was pseudocodish.
Yes, my second thought was, "He says there is a bug here; I wonder if this is one of those cases where the function returns values in list context." For any experienced Perl programmer, this is a thing to be aware of, especially when calling functions in hash maps, not only because it may return multiple values but also because it may return NO values, which would match the "username" key with "password".
We may debate the pros and cons of list context, but it is one of the first features I teach new hires because it can catch people accustomed to other languages by surprise. Perl is not the only language to support this kind of feature; it is present in Ruby with the "splat" operator (*array).
Sorry for the dot-confusion, i replaced it with "->".
As I said, it may be documented and known in the perl community, in my option it's still horrible. But yes, that's just my opinion of course.
> Perl is not the only language to support this kind of feature; it is present in Ruby with the "splat" operator (*array).
Unfortunately I also don't know ruby, but as far as i understand from what you're saying is that in ruby you need to explicitly tell ruby to "splat" the list in.
I really have problems coming up with usecases for this kind of behaviour, but yes, sure under some curcumstances it might be nice. But i'd argue it's the minority and so having to explicitly "whitelist" it would make sense. To avoid exactly this problem, of accidentaly calling a function that might return a list.
The problem that I also see is that at some point you could refactor a function that previously couldn't return a list.
Except maybe that's actually a thing in perl, to use this behaviour as a feature, I don't know.
Not explicit enough. Perl is the only language I've ever seen where
sub f { return 1, 2 }
my %x = (a => f(), b => f());
produces a three-element hash that has '2' as a key and 'b' as a value with no warnings. "Even offsets are keys" is far more error-prone than an actual syntax for hash literals, and splicing a list (not a hash!) into a hash doesn't need syntax this terse because it's hardly ever what I wanted. I think even PHP is more sane here.
I'm not sure how an appeal to PHP as lowest-common-denominator of language sanity argues against understanding the return values of functions you call.
> understanding the return values of functions you call
Sure you should. But confusion between keys and values isn't useful, so there's no reason for a subexpression to be able to do that, and in PHP it can't.
If you are interested (feel free to ignore if not)... Here are some examples for understanding list context and some of the things it can do for you. One funny way to think about it is that in Perl, you store array, array ref, hash, and hash ref structures, and you do not store lists. Lists are a way to express in code, distinct from the actual storage.
Perl's fat arrow => is the same thing as a comma, except that the thing on the left of it may be a bare word under "use strict". In other words, it has little special meaning and helps you to construct a list of things.
my $arr3 = [1 => 2 => 3, 4 => 5]; # $arr3 is [1, 2, 3, 4, 5]
For readability, when we construct hashmaps, we use the fat arrow so that you can more easily distinguish key from value.
my $hash1 = {x => 1, y => 2}; # $hash1->{x} is 1
But you don't have to use the fat arrow:
my $hash2 = {'x', 1, 'y', 2}; # same as $hash1
An array used in an expression like this is considered to be in list context:
my @pairs = ('x', 1, 'y', 2);
my $hash3 = {@pairs}; # same as hash1
Since we can boil about anything down to expression in list context, we have a lot of concise flexibility on constructing arrays and hashes and a variety of ways to express the same concept. Here is a fun use case: Build a hashmap where the keys are the numbers from 1 to 100, and the values are all the string "foo". Other languages perform an elegant dance with infinite cycles and zippers. In Perl, knowing the list context magic, this is a possible solution that shows that "map" may not return the same number of elements as the input list/array:
my %answer = map { $_ => "foo" } (1..100);
Let's join together two arrays with the letter 'x' between them:
my @result = (@array1, 'x', @array2);
Take an array reference, and return a reference to its elements cycled once (this specific problem is awkward in Perl and easier in other languages with nice array operators):
my $doubled = [@$ref, @$ref];
Construct a hashmap where some of the keys are completely optional. This is also an alternative expression for the "many optional parameters to object creation" problem, which most languages let you resolve by allowing you to instantiate the object with defaults and then alter the defaults with accessors in "if" clauses afterwards. In this example, if $self->y is false, then there will not be a "y" key.
my %params = (
x => $self->x,
$self->y ? (y => $self->y) : (),
z => $self->z,
);
MyClass->new(%params); # this actually converts the hashmap into list context!
Tip of the iceberg. Note that I am not defending, just demonstrating. I am on the fence on if I prefer the Ruby explicit style over Perl's style. For one, in Ruby, the fat arrow actually means that you are now constructing a hashmap. In Perl, since it is just a comma, I can exploit it to mean new things, depending on my context.
use Moose;
has age => (isa => 'Int', required => 1);
use Test::More;
subtest 'Math is still valid' => sub {
is(1, 1 => "1 is still equal to itself");
is(2, 2 => "2 is still equal to itself");
};
Just to be completely clear here. The problem with that talk is not the bug. That bug is legit (although it was caused by an inexperienced developer who didn't read the manual).
The main problem with the talk is that almost every single thing he says is wrong, except for that bug. He may have reported it, but he knows almost nothing about the language, and is complaining because he never actually learned Perl, but expects Perl to conform to his expectations.
The fact that he pretends he is correct makes it even worse, since it means he is spreading his ignorance and misconceptions to others who take it as fact.
He is like an english person complaining about the fact that pronouns are gendered in german.
> That bug is legit (although it was caused by an inexperienced developer who didn't read the manual).
Well that's exactly the problem, as I said before, the fact that this is documented doesn't make it not horrible. At least to me, as a non perl person.
> He is like an english person complaining about the fact that pronouns are gendered in german.
Valid complaint in my opinion. And that's my mother tongue.
You're making the same mistake he did. You don't actually know the thing you're talking about (Perl), yet you feel the need to, as first action, make factual statements about it, instead of your first action being to ask WHY i don't think it's horrible.
Also, regarding the gendered pronouns: They perform a very valid function in german, which i can understand you may not be consciously aware of: They serve the disambiguation of similarly sounding words (die Schüssel, der Schlüssel), similarly to Perl sigils.
Back to the issue of whether the bug is horrible. Any Perl developer learns these simple facts very early on:
- a function can return 0 to many values, depending on what kind of list of arguments you call return() with
- the contents of a list depend on the sigils in front of the arguments used in the list constructor
- if the sigil is $, then the list will have at the place of that argument exactly one value
- if the sigil is @, then list will contain zero to many values at the place of that argument
- if the sigil is none, then it's a function call, and the list will, following the first factoid, contain zero to many values at the place of the function call
- you must do two things with functions when calling them in a list constructor:
- 1. manually force them into scalar context in list construction by wrapping the call in scalar()
- 2. read the documentation of the function to see if the function makes any promises of its output
In the case of the bug you have an assignment to a hash, by way of a list constructor. That list constructor calls the cgi function params(). That function is documented to be able to return multiple values.
The programmer in this case didn't take care to read the documentation of the function, and this is not remotely the same thing as having a documented bug.
To the average Perl developer all this information is known and simple.
There's two glaring and obvious bugs right there that would be there if you did a straight port to any other language. Alwaysvalidateuserinput. If you ran this with taint mode on in Perl it'd fail, so you wouldn't even need human eyes to look at it to spot the problem.
Sure. Especially in OOP code i'd say it's not uncommon to fill the data into an object and then validate the whole object in one go.
You might not check again if "role" is still the same, because you wouldn't expect that the user can have any influence on it.
If you were a developer familiar with Perl you'd hopefully at least test with perl -T
If you had no familiarity with Perl I could see you getting bit by that. After you'd learned enough to be proficient enough to be writing code taking actual user input in a production env. you should really know about that issue since it's discussed in the Perl dev. community often enough. Also by that point you should be at a minimum testing any and all code taking actual user input with taint mode which would point out that problem immediately.
If you're seriously telling me you instantly spot the bug (if that were real perl, i don't know perl) without having seen the talk before, then congratulations, nobody else did, please start auditing perl software.
I've fixed that bug in code as far back as 2000. There've been discussions on that anti-pattern on PerlMonks for at least a decade.
This was all covered on HN at the time[1]. I re-post my comment[2] (in direct response to a comment from the speaker) here because it's relevant, but the TL;DR is that he looked in a bunch of very old projects for the use of an obfuscated module and made a talk not about how they were using old, well-known to be bad features, but about how there's a security hole in Perl. It was not a good talk.
From the changelog of the latest release of Perl: "CGI has been upgraded from version 3.63 to 3.65. NOTE: CGI is deprecated and may be removed from a future version of Perl."
You can still make arguments about list handling in Perl, but using a deprecated module, which has been understood by the community for years to be problematic and exists mostly for backwards compatibility, does not bolster your argument, it does the opposite.
Using large monolithic projects that were started 10+ years (Twiki: 1998, Movable Type: 2001, Bugzilla: open sourced by Mozilla in 1998) ago also doesn't lend itself well towards pointing out modern usage of Perl.
Feel free to make any argument you want, but you should use relevant examples if you hope to be taken seriously.
For anyone that didn't see the talk, no the bug isn't SQL injection or something like that (though that'd likely be possible too, watch the talk).
If you pass ?username=foo&username=bar then cgi.param("username") will actually return a list ("foo", "bar").
Now say you pass a list of three elements ?username=foo&username=role&username=admin ... yes seriously. It'll add the list into the hash and use every %2 element as key and the other as value. allowing you to overwrite stuff in hashes.
Of course, all of this is documented. But that doesn't make it not terrible.
PS: I don't hate perl. Perl -e is still very handy sometimes.