I think it's pretty hard to compute the cost, because what's the counterfactual? Even if you don't force a shutdown, people may still opt to isolate themselves and businesses may opt to close, and the degree to which they do so probably depends on the dynamics of the pandemic, so you're going to be pretty sensitive to modeling errors.
So if you know of 10 different events which may occur with probability 10% each, you would prepare for exactly one of them?
I think it's clear that some events, even if rare, are cheaper to plan ahead for than to deal with after the fact, and vice versa (e.g. for many people, self-insuring against minor losses like appliance failures makes sense).
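To make the self-insurance comparison concrete (the numbers below are purely invented for illustration), it's just expected loss vs. the price of covering the risk up front:

```
# Toy expected-value comparison behind "self-insuring"; all numbers invented.
p_failure = 0.05        # assumed annual chance the appliance fails
repair_cost = 400.00    # assumed out-of-pocket repair/replacement cost
warranty_price = 60.00  # assumed annual price of an extended warranty

expected_loss = p_failure * repair_cost   # 20.0 per year on average
print(warranty_price > expected_loss)     # True -> self-insuring wins on average
```

The same comparison can flip for rare events whose after-the-fact cost is catastrophic rather than minor, which is the case for preparing in advance.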
But we can all aspire to be a >1 developer, even if we're not one currently. I don't necessarily think that contentment with being below average is universally bad (maybe it's perfectly rational given one's preferences for leisure vs. work, for example), but I'd hope that most <1 developers are juniors who hope to improve over time.
But writing code is only a small part of a software engineer's job (at least that's true everywhere I've worked), so selecting people who have opted to dedicate their lives to only that portion might be a worse idea than hiring better-rounded candidates, when it comes to building real software systems that need to be liked by users.
Note that some of this research, especially early on, overstated the 'bias' here because the authors didn't realize that the default 'analogy' routines specifically rule out returning any word that was also in the prompt words. So, even if the closest word-vector after the `man->woman` translation was the same role (as is often the case), you wouldn't see it in the answer.
Further, they cherry-picked the most potentially offensive examples, in some cases relying on the increased 'fuzziness' of more-outlier tokens (like `computer_programmer`).
You can test analogies against the popular GoogleNews word-vector set here – http://bionlp-www.utu.fi/wv_demo/ – but it has this same repeated-word-suppression.
So yes, when you try "man : computer_programmer :: woman : _?_" you indeed get back `homemaker` as #1 (and `programmer` a bit further down, and `computer_programmer` nowhere, since it's filtered, thus unclear where it would have ranked).
But if you use the word `programmer` (which I believe is more frequent in the corpus than the `computer_programmer` bigram, and thus has a stronger vector), you get back words closely related to 'programmer' as the top 3, and 23 other related words before any strongly woman-gendered professions (`costume_designer` and `seamstress`).
You can try lots of other roles you might have expected to be somewhat gendered in the corpus – `firefighter`, `architect`, `mechanical_engineer`, `lawyer`, `doctor` – but continue to get back mostly ungendered analogy-solutions above gendered ones.
So: while word-vectors can encode such stereotypes, some of the headline examples are not representative.
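If you'd rather poke at this locally than through that web demo, a minimal gensim sketch (assuming you've downloaded the GoogleNews vectors; the path below is a placeholder) looks roughly like this:

```
# Minimal sketch, assuming gensim and a local copy of the GoogleNews vectors.
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)  # placeholder path

# The standard analogy call silently drops any result word that appeared in
# the query, so `computer_programmer` can never come back from this call:
print(kv.most_similar(positive=["woman", "computer_programmer"],
                      negative=["man"], topn=5))

# Doing the arithmetic by hand and searching the raw vector space skips that
# filtering, so you can see where the query words themselves actually rank:
target = kv["computer_programmer"] + kv["woman"] - kv["man"]
print(kv.similar_by_vector(target, topn=5))
```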
One thing I’ve been tempted to research but never had time for myself: can one use that aspect of word embeddings to automatically detect and quantify prejudice?
For example, if you trained only on a corpus of circa-1950 newspapers, would «“man” - “homosexual” ~= “pervert”» hold, or something similar? I remember from my teenage years (as late as the 90s!) that some UK politicians spoke as if they thought like that.
I also wonder what biases it could reveal in me that I’m currently unaware of… and how hard it may be to accept that the error exists, or to improve myself, once I do. There’s no way I’m flawless, after all.
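If I ever did find the time, I imagine the experiment would look roughly like this gensim sketch (the corpus loader is a made-up placeholder, and whatever came out would only describe that corpus, not reality):

```
# Rough sketch only. `load_tokenized_corpus` is a hypothetical helper standing
# in for however you'd actually tokenize a 1950s newspaper archive.
from gensim.models import Word2Vec

sentences = load_tokenized_corpus("newspapers_1950s/")  # hypothetical helper

model = Word2Vec(sentences, vector_size=300, window=5, min_count=20)
kv = model.wv

print(kv.most_similar("homosexual", topn=20))   # what clusters nearby in that corpus?
print(kv.similarity("homosexual", "pervert"))   # raw cosine similarity between the two
```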
> For example, if you trained only on a corpus of circa-1950 newspapers, would «“man” - “homosexual” ~= “pervert”» hold, or something similar?
If it did, what conclusion would you be able to draw?
As far as I know, there's no theoretical justification for thinking that word vectors are guaranteed to capture meaningful semantic content. Empirically, sometimes they do; other times, the relationships are noise or garbage.
I am wholeheartedly in favor of trying to examine one's own biases, but you shouldn't trust an ad-hoc algorithm to be the arbiter of what those biases are.
I think there's a further problem: there has never been a shortage of evidence about things like this. The point is that prejudice and discrimination are not evidence-based in the first place. People who support existing unjust structures are generally strongly motivated to turn a blind eye. Even people who don't support them often are; it's simply far easier and more socially advantageous to stop worrying and love the bomb.
I think this is a large part of what goes on in the digital humanities, with varying degrees of success. The problem, as usual, is not that there isn't an abundance of evidence. It's simply that nobody reads sociology papers except sociologists.
In this formulation wouldn't Basilica reflect the existing biases of the organization?
resumes of candidates
- resumes of employees you fired
+ resumes of employees you promoted
---------------------------------------
= resumes of candidates you should hire
It's a lot of hard work to reduce bias in promotions and terminations.
Basilica might reinforce that hard work when evaluating candidates.
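Concretely, that arithmetic might look something like the sketch below. Everything in it is a made-up placeholder (`embed` is a stub, not Basilica's actual API, and the resume lists are hypothetical); the point is just that the ranking inherits whatever shaped past promotions and firings.

```
import hashlib
import numpy as np

def embed(text, dim=256):
    # Stand-in embedding: a deterministic pseudo-random vector per text.
    # A real system would call an embedding model or service here instead.
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    return np.random.default_rng(seed).standard_normal(dim)

def centroid(texts):
    return np.mean([embed(t) for t in texts], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical inputs: plain-text resumes.
fired_resumes = ["..."]
promoted_resumes = ["..."]
candidate_resumes = ["..."]

good = centroid(promoted_resumes)
bad = centroid(fired_resumes)

# Rank candidates by closeness to "promoted" minus closeness to "fired".
ranked = sorted(candidate_resumes,
                key=lambda r: cosine(embed(r), good) - cosine(embed(r), bad),
                reverse=True)
```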
Or you could use the techniques described in your citation to allow Basilica to help de-bias the hiring process.
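For what it's worth, one common idea in that debiasing literature (not necessarily the exact technique in the citation) is to estimate the unwanted direction and project it out of each embedding before scoring anything; a rough word-vector sketch:

```
# Rough sketch of "project out a bias direction" debiasing on word vectors;
# the path is a placeholder, and a single he/she pair is a deliberately crude axis.
import numpy as np
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)  # placeholder path

def project_out(v, axis):
    axis = axis / np.linalg.norm(axis)
    return v - np.dot(v, axis) * axis

gender_axis = kv["he"] - kv["she"]   # real work averages many such pairs
debiased = project_out(kv["computer_programmer"], gender_axis)
```

The same projection works on document embeddings like resumes, once you have some way of estimating the direction you want removed.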
I would certainly dispute that programming is necessarily slower. Ironically, I find myself thinking much harder about types in a dynamic language, precisely because there's no compiler there to do the rote work for me. Lately I've been using pandas dataframes quite a bit, and the number of hours I've wasted on type problems is mind-boggling.
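A typical example of the kind of silent type problem I mean (toy data): one stray string turns a whole numeric column into `object`, and nothing complains until some operation far downstream.

```
import pandas as pd

df = pd.DataFrame({"price": [1.5, 2.0, "3,0", 4.25]})  # one comma-decimal typo
print(df["price"].dtype)   # object, not float64

# df["price"].sum() only raises a TypeError when it finally hits the string;
# the honest fix is to coerce explicitly and decide what to do with the NaN:
df["price"] = pd.to_numeric(df["price"], errors="coerce")
print(df["price"].dtype)   # float64, with the bad row as NaN
```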
It is, unfortunately, impossible to prove a negative, not that that's ever stopped anyone from trying. For what it's worth, though, a cursory Google search turns up no instances of Paul Krugman or Larry Summers mentioning it, which I would expect for something that "most economists" understood to exist.