Is it because there might be another Python process, started after this one, that has mapped the same executable or shared libraries?
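One way to check this on Linux is to compare the file-backed entries of /proc/<pid>/maps for the two processes; a rough sketch (the PIDs are hypothetical):

    # Compare the file-backed shared-object mappings of two processes.
    # The PIDs below are placeholders for the two Python processes.
    def mapped_libs(pid):
        with open("/proc/%d/maps" % pid) as f:
            return {line.split()[-1] for line in f if ".so" in line}

    print(sorted(mapped_libs(1234) & mapped_libs(5678)))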
(Another telltale sign: the link refers to Rolf Heuer as the Director General, but he is in fact the former Director General; the current one is Fabiola Gianotti.)
That's an excellent idea. I wrote the copy on that page trying to convey "can I get your help against these bad people?", rather than something blaming the user or a scary-looking warning.
But showing them the _reason_ why a certain website is blocked can become an opportunity to teach people critical thought, something that other comment threads point out.
Given two clusters, I can predict the class of a new article by choosing the cluster whose centroid is closest to the article.
So, given some training data that produced two reasonable clusters with respect to the ground truth, I have a model that I can expect to generalize well to new data.
Now, this is not what that notebook shows, because it's missing the evaluation on test data! The main point of the notebook is that the Jaccard distance between the token sets of pages' HTML, despite being very simple, appears to generate a reasonable model.
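To make the prediction step concrete, here is a minimal sketch; the names (tokenize, jaccard, medoid, predict) are mine, not the notebook's, and since token sets have no arithmetic mean, a medoid stands in for the centroid:

    def tokenize(html):
        # Split the raw HTML of a page into a set of tokens.
        return set(html.split())

    def jaccard(a, b):
        # Jaccard distance: 1 - |intersection| / |union|.
        union = a | b
        return 1.0 if not union else 1.0 - len(a & b) / len(union)

    def medoid(cluster):
        # The member with the smallest total distance to the rest
        # of the cluster, used here in place of a centroid.
        return min(cluster, key=lambda c: sum(jaccard(c, o) for o in cluster))

    def predict(article_html, clusters):
        # Assign the new article to the cluster whose medoid is closest.
        tokens = tokenize(article_html)
        medoids = {label: medoid(members) for label, members in clusters.items()}
        return min(medoids, key=lambda label: jaccard(tokens, medoids[label]))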
Keep in mind that I built it in one night for http://lauzhack.com/, so it's not intended to be a fully featured solution to the problem.
The main issue I wanted to address is that such a blocker must be widely installed to be useful. Therefore, the Facebook message is intended to prod current users of the extension to ask their friends to install it as well.
Exactly. Those are the only websites I'm interested in targeting with this extension.
For example, I would never blacklist Breitbart, because I'm not interested in censoring political opinions I disagree with. I just want to free us from the burden of these websites that add nothing to the world and leech attention from everyone.
> I just want to free us from the burden of these websites that add nothing to the world and leech attention from everyone.
The slope is very slippery. It is very easy to put forth an argument that some left- or right-wing media "[adds] nothing to the world and [leeches] attention from everyone".
Keeping the focus on blatant spam sites is a noble cause, but this tool is as easily misused as a firearm.
I disagree on this slope being slippery (or actually, this being a slope at all).
Let me be more precise and use tptacek's words in https://news.ycombinator.com/item?id=12999887: "fake news = spam sites built from scripts consisting largely of a backcatalog of nonsense stories [...] with one or two carefully produced fake stories as a 'payload'".
I think this definition is as objective as it gets, and it clearly excludes things I might disagree with politically but that are not the target of this extension.
We don't have similar qualms about voluntary mass blocking of ads.
I think the reverse assumption - that everything should be considered a valid information source until conclusively proved otherwise - is probably a more dangerous bias than assuming that some media really does add nothing to the world.
That logic suggests we should ban free speech and only allow certain people and organizations to be registered as valid information sources, preferably after passing some sort of test that shows they're legitimate and trustworthy.
Good point. I suppose it's async with respect to the render anyway. It looks even more nuanced on pages that have an iframe [1] (though presumably that's not the case here).
I had to use a similar approach when creating a cluster analysis of the amendments in the Italian Senate [0].
The Italian Senate offers a SPARQL endpoint [1], which unfortunately doesn't provide access to the texts of the amendments, so I had to roll my own: a small spider built with Scrapy [2].
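The spider boils down to something like this; the start URL and CSS selectors below are placeholders, not the Senate's actual markup:

    import scrapy

    class AmendmentsSpider(scrapy.Spider):
        name = "amendments"
        start_urls = ["https://www.senato.it/"]  # placeholder listing page

        def parse(self, response):
            # Follow each link to an amendment's detail page
            # (the selector is hypothetical).
            for href in response.css("a.amendment::attr(href)").getall():
                yield response.follow(href, callback=self.parse_amendment)

        def parse_amendment(self, response):
            # Collect the full text of the amendment
            # (again, a hypothetical selector).
            yield {
                "url": response.url,
                "text": " ".join(response.css("div.testo ::text").getall()).strip(),
            }

Running it with `scrapy runspider amendments.py -o amendments.json` dumps the texts ready for the clustering step.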