
1st party ads are not unblockable. They only lack one aspect that helps identify them (the 3rd party hostname). But they still can be dealt with.

One way browsers try to take away that freedom is by limiting what extensions can do. If that continues, at some point we will need a new browser to accomplish it.

My favorite vision of the future would be if Debian would provide a version of Chrome or Firefox that: a) is stripped of all tracking and b) gives extensions full access to everything.



At some point, you’re going to have to apply spam detection techniques rather than whitelist/blacklist ones.


As long (and where) labeling ads is mandatory, there will always be a way to identify them.


So what is the strategy to deal with legitimate 1st party subdomains and tracking/ads subdomains if they use random strings as identifiers? (I am guessing this is where we will need a combination of crawlers and machine learning algorithms)


Couldn't you blacklist all subdomains of the 1st party and whitelist the few that are actually real?

Or, assuming they have a small list of subdomains that redirect to ad servers, you could generate a list with a script that checks all their subdomains and creates a block list based on that. For example, the site discussed in the OP has all their subdomains listed here: https://crt.sh/?q=%25.liberation.fr
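A rough sketch of the script idea above, in Python. It assumes you already have the subdomain list (for example, copied from the crt.sh page) and that the system resolver reports the CNAME target as the canonical name, which is usual but not guaranteed. `AD_SUFFIXES` and `build_blocklist` are made-up names, and the suffix list is a placeholder you would fill in yourself.

```python
import socket
from typing import Callable, Iterable, List, Tuple

# Hypothetical list of ad-server domains; fill in with whatever you want to block.
AD_SUFFIXES: Tuple[str, ...] = ("smartadserver.com",)


def canonical_name(host: str) -> str:
    """Return the canonical name the system resolver reports for host."""
    # gethostbyname_ex returns (canonical_name, alias_list, ip_list).
    return socket.gethostbyname_ex(host)[0]


def build_blocklist(
    subdomains: Iterable[str],
    resolve: Callable[[str], str] = canonical_name,
    ad_suffixes: Tuple[str, ...] = AD_SUFFIXES,
) -> List[str]:
    """Return the subdomains whose canonical name falls under an ad-server domain."""
    blocked = []
    for host in subdomains:
        try:
            target = resolve(host).rstrip(".").lower()
        except OSError:
            # Names that no longer resolve are skipped, not blocked.
            continue
        if any(target == s or target.endswith("." + s) for s in ad_suffixes):
            blocked.append(host)
    return sorted(blocked)
```

The resolver is passed in as a parameter so the matching logic can be tried against a fixed mapping without touching the network.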

Edit: looking at the OP case, it seems like they only have one ad domain. I'm not sure I see this as a serious issue until multiple sites start rolling out thousands of subdomains, some pointing back to the real server, others pointing to the ad server. Maybe that will happen but it's a pretty big barrier to entry, and just short of proxying everything through the 1st party.


> whitelist the few that are actually real

I'm speculating that the balance is in the reverse favor. Last night I was looking at some file on GitHub which was redirecting to what looked like an S3 bucket subdomain named with a pattern like "github-production-f7e281a2", which I simply presumed to be cache-busting via subdomain instead of appending the hash to the filename. If my assumptions were correct, every time GitHub deploys a new build, you would have to whitelist that subdomain.


Looking for suspiciously high entropy values compared to one's native language would be one way.
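For what it's worth, a minimal sketch of the entropy check in Python. It measures only per-character Shannon entropy within a single label, which is a much cruder signal than a real comparison against a language model. The 3.5-bit threshold is an arbitrary illustrative cutoff, not a tuned value, and `looks_random` is a made-up name.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the characters in label, in bits per character."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(label: str, threshold: float = 3.5) -> bool:
    """Flag labels whose character entropy exceeds an (assumed) threshold."""
    return shannon_entropy(label) > threshold
```

Note that a label of length n can never score above log2(n) bits, so short random labels slip under any fixed threshold. A character n-gram model trained on real words would probably separate them better.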


Devil's advocate: then instead of using subdomains with randomly generated strings, we use words from a dictionary instead.


that won't work: for instance https://twitter.com/aeris22/status/1193644687950860289 (securite means security/safety in French, but that subdomain is a CNAME for smartadserver)


Then we block those words :-)


You would have to block entire wordlists to combat subdomains like that. It would make more sense to whitelist subdomains instead, but it would require much more effort in order to determine what subdomains are required for the website to function. Additionally, if the site in question ever decided to change anything around, someone would have to catch the breaking change and have it corrected on the whitelists for the site to function again.


How do you know what words to block?


Machine learning by analyzing what displays on the page by blocking different domains. Bots can be automated to do that continuously and update a decentralized database with such information.


Chrome isn't open source, so not much chance of that happening.



Chromium is not Chrome. Chrome is based on code for Chromium, but that’s where the similarities end.


Chromium is Chrome with only a few proprietary bits removed. It's essentially Chrome everywhere that it matters to this particular discussion.


And why would you use Chrome instead of Chromium? Stick to Firefox and Chromium.


Because one likes the features available only in Chrome? I haven’t checked recently and don’t know if this is still accurate, but Chromium used to not have the PDF reader and DRM support (for Netflix, etc.)


There are a number of FOSS and proprietary PDF readers for Chromium/Chrome. There are also Netflix apps outside of Chrome. You don't have to use Chrome...


That's like saying that the Ubuntu kernel is based on code for Linux.


Yes, pretty much, except that Canonical is nice enough to open source their patches. And they layer a ton of patches on top of the official kernel trees, mostly backports but also some new features. Their linux_5.0.0-36.39.diff is close to 35MB.

And remember the time when Debian layered some changes on top of openssl? http://faq.caslavka.cz/attachments/196/randomness.png

Now, what changes does Google layer on top of chromium to make Chrome? Do you know exactly?


Yes, it's pretty easy to disassemble it and find out. It's basically auto-updates, some closed-source extensions like Chromecast (although you can manually download the Chromecast bits for Chromium if you'd like), some branding differences as compile-time #defines. https://chromium.googlesource.com/chromium/src/+/master/docs...

(Do you know that all of Chromium is in fact open source? Have you looked at the source and the build process? Are there any parts in it that are actually precompiled binary blobs?)


Does Chromium contain the DRM needed to play sites like Netflix? I know many here are against DRM, but it’s necessary if I want to use my Netflix.


Netflix will play a 720p version if you don't have a DRM-supported browser. Also, Netflix distributes native apps to all platforms, which means you don't need Chrome to use Netflix in full fidelity on any device except maybe Linux?


Widevine is a DLL bundled with Chrome. You can copy it into a Chromium installation and use Netflix. I don't know if this violates TOS/licenses/law.


You realize the extensions can track you too?


This problem can be solved easily by using an open-source extension that has reproducible builds. Making sure it doesn't have a built-in tracker is as easy as looking at the source code. And we can make sure the final hash of the extension (without the software signature blob) is the same as that of your own build, so we know it was not tampered with before being uploaded to the extension store.
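A sketch of that hash comparison in Python, under some loud assumptions: the extension package is a plain zip archive, and the store's signature lives in entries under `META-INF/` (true for some package formats, not necessarily the one you are checking). `content_digest` and the file names in the usage note are hypothetical.

```python
import hashlib
import zipfile
from typing import BinaryIO


def content_digest(package: BinaryIO, skip_prefix: str = "META-INF/") -> str:
    """SHA-256 over the sorted entries of a zip package, skipping signature files.

    Assumes signature data is stored under skip_prefix; adjust for other formats.
    """
    digest = hashlib.sha256()
    with zipfile.ZipFile(package) as archive:
        for name in sorted(archive.namelist()):
            # Skip the assumed signature directory and bare directory entries.
            if name.startswith(skip_prefix) or name.endswith("/"):
                continue
            data = archive.read(name)
            # Hash name and length along with the bytes so entry boundaries matter.
            digest.update(name.encode("utf-8") + b"\0")
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()
```

You would call it on the store download and on your own build, e.g. `content_digest(open("store.xpi", "rb")) == content_digest(open("local-build.xpi", "rb"))`. Hashing the decompressed entries rather than the raw file sidesteps differences in compression settings or entry order, though it only proves anything if the build itself is reproducible.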


You can also track people if they install your adware.exe. The emphasis is on install. What software you install is an entirely different threat scenario than visiting a website.



