Is it because there might be another Python process, started after this one, that has mapped the same executable or shared libraries?
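One way to check this on Linux is to compare the file-backed entries of /proc/<pid>/maps for the two processes; a rough sketch (the PIDs are hypothetical):

    # Compare the file-backed shared-object mappings of two processes.
    # The PIDs below are placeholders for the two Python processes.
    def mapped_libs(pid):
        with open("/proc/%d/maps" % pid) as f:
            return {line.split()[-1] for line in f if ".so" in line}

    print(sorted(mapped_libs(1234) & mapped_libs(5678)))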
(Another telltale sign: the link refers to Rolf Heuer as the Director General, but he is in fact the former Director General; the current one is Fabiola Gianotti.)
That's an excellent idea. I wrote the copy on that page trying to convey "can I get your help against these bad people?", rather than something blaming the user or a scary-looking warning.
But showing them the _reason_ why a certain website is blocked can become an opportunity to teach people critical thought, something that other comment threads point out.
Given two clusters, I can predict the class of a new article by choosing the cluster whose centroid is closest to the article.
So, given some training data that produced two reasonable clusters with respect to the ground truth, I have a model that I can expect to generalize well to new data.
Now, this is not what that notebook shows, because it's missing the evaluation on test data! The main point of the notebook is that the Jaccard distance between the token sets of pages' HTML, despite being very simple, appears to generate a reasonable model.
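To make the prediction step concrete, here is a minimal sketch; the names (tokenize, jaccard, medoid, predict) are mine, not the notebook's, and since token sets have no arithmetic mean, a medoid stands in for the centroid:

    def tokenize(html):
        # Split the raw HTML of a page into a set of tokens.
        return set(html.split())

    def jaccard(a, b):
        # Jaccard distance: 1 - |intersection| / |union|.
        union = a | b
        return 1.0 if not union else 1.0 - len(a & b) / len(union)

    def medoid(cluster):
        # The member with the smallest total distance to the rest
        # of the cluster, used here in place of a centroid.
        return min(cluster, key=lambda c: sum(jaccard(c, o) for o in cluster))

    def predict(article_html, clusters):
        # Assign the new article to the cluster whose medoid is closest.
        tokens = tokenize(article_html)
        medoids = {label: medoid(members) for label, members in clusters.items()}
        return min(medoids, key=lambda label: jaccard(tokens, medoids[label]))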
Keep in mind that I built it in one night for http://lauzhack.com/, so it's not intended to be a fully featured solution to the problem.
The main issue I wanted to address is that such a blocker must be widely installed to be useful. Therefore, the Facebook message is intended to prod current users of the extension to ask their friends to install it as well.
Exactly. Those are the only websites I'm interested in targeting with this extension.
For example, I would never blacklist Breitbart, because I'm not interested in censoring political opinions I disagree with. I just want to free us from the burden of these websites that add nothing to the world and leech attention from everyone.
> I just want to free us from the burden of these websites that add nothing to the world and leech attention from everyone.
The slope is very slippery. It is very easy to put forth an argument that some left- or right-wing media "[adds] nothing to the world and [leeches] attention from everyone".
Keeping the focus on blatant spam sites is a noble cause, but this tool is as easily misused as a firearm.
I disagree on this slope being slippery (or actually, this being a slope at all).
Let me be more precise and use tptacek's words in https://news.ycombinator.com/item?id=12999887: "fake news = spam sites built from scripts consisting largely of a backcatalog of nonsense stories [...] with one or two carefully produced fake stories as a 'payload'".
I think this definition is as objective as it gets, and it clearly excludes things I might disagree with politically but that are not the target of this extension.
We don't have similar qualms about voluntary mass blocking of ads.
I think the reverse assumption - that everything should be considered a valid information source until conclusively proved otherwise - is probably a more dangerous bias than assuming that some media really does add nothing to the world.
That logic suggests we should ban free speech and only allow certain people and organizations to be registered as valid information sources, preferably after passing some sort of test that shows they're legitimate and trustworthy.
Good point. I suppose it's async with respect to the render anyway. It looks even more nuanced on pages that have an iframe [1] (though presumably that's not the case here).
I had to use a similar approach when creating a cluster analysis of the amendments in the Italian Senate [0].
The Italian Senate offers a SPARQL endpoint [1], which unfortunately doesn't provide access to the texts of the amendments, so I had to roll my own: a small spider built with Scrapy [2].
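The spider boils down to something like this; the start URL and CSS selectors below are placeholders, not the Senate's actual markup:

    import scrapy

    class AmendmentsSpider(scrapy.Spider):
        name = "amendments"
        start_urls = ["https://www.senato.it/"]  # placeholder listing page

        def parse(self, response):
            # Follow each link to an amendment's detail page
            # (the selector is hypothetical).
            for href in response.css("a.amendment::attr(href)").getall():
                yield response.follow(href, callback=self.parse_amendment)

        def parse_amendment(self, response):
            # Collect the full text of the amendment
            # (again, a hypothetical selector).
            yield {
                "url": response.url,
                "text": " ".join(response.css("div.testo ::text").getall()).strip(),
            }

Running it with `scrapy runspider amendments.py -o amendments.json` dumps the texts ready for the clustering step.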