AdFlush (dl.acm.org)
Features can be brittle, but they are understandable. The paper's appendix [1] lists the 27 features that are likely to mark a request/resource as "ad-related". These include interesting ones like JS AST depth, average JS identifier length, the "bracket to dot notation ratio in JS", and a number of graph measures for the graph of scripts.
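To make a few of those features concrete, here is a rough sketch of how they could be approximated from raw script text. It is not the paper's extraction code: the paper's features come from a real JS AST, while this uses regexes and bracket counting as crude stand-ins, ignores string and comment contents, and counts keywords as identifiers. The function name `js_surface_features` and the dictionary keys are made up for illustration.

```python
import re

def js_surface_features(source: str) -> dict:
    """Crude text-level proxies for three of the features named above.

    Assumption: regexes stand in for a real JS parser, so strings,
    comments and keywords are not treated specially.
    """
    # Identifier-like tokens (this also picks up keywords and words inside strings).
    identifiers = re.findall(r"[A-Za-z_$][\w$]*", source)
    avg_len = sum(len(name) for name in identifiers) / len(identifiers) if identifiers else 0.0

    # obj["prop"] style access: a '[' directly after an identifier, ']' or ')'.
    bracket = len(re.findall(r"(?<=[\w$\])])\[", source))
    # obj.prop style access: a '.' between an identifier-ish char and a name start.
    dot = len(re.findall(r"(?<=[\w$\])])\.(?=[A-Za-z_$])", source))

    # Maximum nesting of (), [] and {} as a stand-in for AST depth.
    depth = max_depth = 0
    for ch in source:
        if ch in "([{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch in ")]}":
            depth = max(depth - 1, 0)

    return {
        "avg_identifier_length": avg_len,
        "bracket_accesses": bracket,
        "dot_accesses": dot,
        "bracket_to_dot_ratio": bracket / dot if dot else None,
        "max_nesting": max_depth,
    }
```

Obfuscated ad scripts tend to favor `window["doc" + "ument"]` over `window.document`, which is presumably why a bracket-to-dot ratio carries any signal at all.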
And contrary to what comments in this thread are saying, they do compare against a blocklist-based adblocker: uBlock Origin. That's in section 5.5. They say they outperform uBlock Origin. But even they say they don't reduce overall page load time because their algorithm is expensive.
The superior score was an F1 of 0.86 vs 0.84 for AdFlush vs uBlock Origin, and it's not clear to me that this is a statistically significant difference. They do not claim it is.
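For a sense of scale, here is a small sketch of how F1 is computed and how one might check whether a gap that size survives resampling. The confusion counts are hypothetical, picked only so that the scores land on 0.86 and 0.84; they are not taken from the paper, and the bootstrap is a generic technique, not something the authors report doing.

```python
import random

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_from_pairs(pairs) -> float:
    """pairs is a list of (is_ad, predicted_ad) booleans, one per request."""
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    return f1_score(tp, fp, fn)

def bootstrap_f1_interval(pairs, rounds: int = 1000, seed: int = 0):
    """Percentile 95% interval for F1, resampling requests with replacement."""
    rng = random.Random(seed)
    scores = sorted(
        f1_from_pairs(rng.choices(pairs, k=len(pairs))) for _ in range(rounds)
    )
    return scores[int(0.025 * rounds)], scores[int(0.975 * rounds) - 1]

# Hypothetical counts, chosen only to reproduce the two headline scores.
print(f1_score(tp=86, fp=14, fn=14))  # 0.86
print(f1_score(tp=84, fp=16, fn=16))  # 0.84
```

If the two blockers' intervals overlap heavily on the same test set, a 0.02 gap says very little; a paired test on the same requests would be the more careful comparison.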
For Brave browser users, you can see what hardcoded lists you're using at brave://adblock .
As for the whole cat and mouse game, how to detect an "ad" if it's served with the content fully server-side? Now _that_ needs some serious ML to decipher.
This has been my red line on where I will allow ads vs blocking them. If a site is hosting their own ads, that's acceptable to me. If they are using an ad provider, that is not. The newspaper example is my go to. If you wanted your ad in a paper, you called the paper and took out an ad. Today's equivalent would be every time you opened the paper, a slight delay while it randomly chose the highest bids for the ad space while potentially also inserting something that would slowly eat your hands. That's a nope.
You are obviously in the camp that feels entitled to be able to read anything at anytime without allowing for a website to earn money by wanting to block all ads regardless of their origin.
I think the authors want to compare apples with apples, so they only compare their algorithm to other adblockers that use algorithms, as opposed to those which use crowdsourced lists. The paper somewhat acknowledges this:
> However, manual maintenance of these filter lists requires significant human effort
Seems like one of those tasks where crowdsourcing scales so nicely (only one person has to report an ad for it to go into a crowdsourced list that blocks it for millions of others) that it makes an algorithmic approach unnecessary.
The only way to avoid native ads is to stop consuming content that relies on ads.
Neat results, I wonder how it compares to uBO or the different blacklists. I assume it self-updates with newer techniques and can detect certain patterns?
If I recall, in Permutation City there's some part where somebody deals with spam with AI. The user tries to use a simulation to listen to potential spam to filter it, while the spam tries to figure out whether a real person is listening to it and only tries to spam when a real person is there.
Or something along those lines, it's been a long time since I read it.
The harder, more pernicious type of ads are the modals that pop up when your cursor moves toward the back button, or when you scroll down a certain distance on the page. "Wait! Before you go, take a moment to give us your email address!"
Those can be blocked, but by the time you've seen them, they've already done all the damage they can do—which is to say, they've annoyed you.
I wish somebody could come up with a way to detect and stop them. I spent an afternoon trying to come up with reusable techniques to detect these popups, but there are just too many possibilities.
There are few things I feel radical about, and ads are one of them. I believe they are a drain in several ways:
1. They waste computational resources and electricity on both ends.
2. They compromise the visual design and layout of webpages.
3. They distract and take mental energy away from the user.
4. They make the internet (and anywhere ads exist) more "ugly" and less aesthetically pleasing - which negatively impacts mental health.
5. They often sell low-quality services/products or outright scams, which harms those least educated and poorest individuals.
Death to advertisement! On billboards! On television! On the internet!
Ads are a parasite on the human mind that need to go away, forever.
It's a spectrum: Some level is an unavoidable part of communication ("I like dogs" forces you to think of dogs) some more is considered normal and traditional manipulation ("My food smells nice, that makes you hungry, wanna buy?") and then it goes on into grey-areas, scams, and eventually to potential extremes like "this image induces nausea" or "this sound knocks you out".
I’m building a pi-hole type solution for myself and essentially want all the filtering and blocking to happen at my firewall and not on my client (phone, laptop, tablet).
However you might end up using
1. pi-hole on router
2. Adguard as device level DNS
3. UBO on Firefox (android only)
Running all three is possible, but it is wasteful and not recommended. Either 1 or 2, plus 3, is enough.
That's why I use uBlock and PiHole, which I deem enough.
I'd like to see a tandem uBO+AdFlush extension that just enables uBO by default, with an "I still see ads!" button in the extension UI that refreshes with AdFlush enabled and auto-submits any missed ads to a new FlushList filter list.
> This project is a half-serious, half-assed attempt to demonstrate that in the next few years the process of blocking this type of content could be almost entirely automated. Yes, it would be wasteful from a computational and human potential perspective, and otherwise completely unnecessary, but hey, more money would change hands!
https://github.com/SKKU-SecLab/AdFlush/tree/main?tab=readme-...
But since the first webpage I tried still had huge ads, I turned uBlock back on ;)
... Has anyone even heard of these ad blockers before?
[1] https://arstechnica.com/gadgets/2023/11/google-chrome-will-l...
Is it that easy? Sounds very abusable
It's easier to get a domain added than removed. And for the "corruption"/"racketeering" part, it's a "win-win" for the adblockers and the list maintainers.
Adblockers also often pay browsers to be integrated by default (AdGuard, Adblock Plus, etc), and then they negotiate with publishers to whitelist some domains (not necessarily the most obvious, can just be analytics).
"We offer your domain to be unblocked on xx millions of devices by default, this will give you a revenue uplift of +yy%"
i had to create a ticket in a repo explaining why blocking a whole domain instead of a single subdomain was actually pretty bad. they approved it and reverted the change.
finding where exactly i had to open the ticket and what to write was a “down the rabbit hole” experience.
Not at all. I use Brave and "shield down" websites that I like and generally keep their ad situation under control (incl. 3rd party). But your point of hosting vs 3rd party is a good one and especially because often one 3rd party connects to another.
Likewise, I "block" annoying parts of websites like Yahoo Fantasy Football's enormous top nav that's not even an ad.
I basically have all my devices use it when I am on my network, and when I am off my network, my Wireguard connection (or Tailscale depending...) uses my home DNS server.
i don't really buy your argument
Sounds like perhaps your task was to ensure a company's ads got through an adblocker?
they were blocking a whole domain instead of blocking the ad-serving subdomain.
the issue was rectified, the main domain was replaced by the ad-serving subdomain.
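A minimal sketch of why that distinction matters, assuming the list matches a hostname against each of its parent domains (a simplification of how domain-anchored filter rules behave). The function name `matching_rule` and the example.com hostnames are hypothetical.

```python
def matching_rule(hostname: str, blocked_domains: set):
    """Return the blocklist entry that covers hostname, or None.

    Assumption: an entry blocks the domain itself and every subdomain of it.
    """
    labels = hostname.lower().rstrip(".").split(".")
    # Try the full hostname first, then each parent domain in turn.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in blocked_domains:
            return candidate
    return None

too_broad = {"example.com"}       # blocks the whole site
targeted = {"ads.example.com"}    # blocks only the ad-serving subdomain

print(matching_rule("www.example.com", too_broad))      # example.com
print(matching_rule("www.example.com", targeted))       # None
print(matching_rule("cdn.ads.example.com", targeted))   # ads.example.com
```

With the broad entry the site's own pages are collateral damage; with the targeted entry only the ad host and anything beneath it is caught.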
The default lists used by uBlock for example include things like error tracking telemetry, Sentry for example.
I can see why people want to block that stuff (privacy) but it’s not exactly an “ad”
I hope we'll not end up in a DRM-like system where ads are somehow really baked in and content stops working for lay-people if they try to circumvent ads.
You could download the Chromium source and patch it to change the extensions APIs (or better, just use Firefox), but the majority of users won't do this, and extension writers aren't going to make a version for a patched Chromium browser unless it has significant market share and support.
[1] https://nordvpn.com/blog/manifest-v3-ad-blockers/
[2] https://www.eff.org/deeplinks/2021/12/chrome-users-beware-ma...
Unless you’re running that 20%, Google controls it, and they basically write the standards anymore.
I thought this is rather obvious, at least for those worried about experience. Do you think all those who realize they're suffering from ads don't think about using a non-Chromium browser?
Under manifest v3, extensions are not able to dynamically inspect requests, instead, they may only apply rules to net requests. Even worse, there is a limitation of only 5000 rules per extension!! [1]
Even WORSE worse, under Chrome's manifest v3 rules, the extension cannot load any external code! Meaning that blocklists must be packaged with the extension. [2] Now, one might read that link as not affecting block lists, since a list is not a "library" and it's not "code" so long as it's just a list of textual rules.... however, Google considers the following to be a violation: "Building an interpreter to run complex commands fetched from a remote source, even if those commands are fetched as data". [3]
Sneaky sneaky. An extension update (and hence new app store submission) is required to update filter lists.
In other words, dynamic net requests are banned, and remotely-updated blocklists are banned as well.
[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
[2] https://developer.chrome.com/docs/extensions/develop/migrate...
[3] https://developer.chrome.com/docs/webstore/program-policies/...
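For anyone who hasn't seen what "only apply rules to net requests" looks like in practice: a static declarativeNetRequest ruleset is a JSON array that ships inside the extension package. Below is a rough sketch of a build step that turns a list of domains into such rules. The rule fields (`id`, `priority`, `action`, `condition.urlFilter`) follow the documented rule format; the helper name `domains_to_dnr_rules` and the example domains are made up, and real filter lists carry far more than plain domains.

```python
import json

def domains_to_dnr_rules(domains, start_id: int = 1):
    """Turn plain domains into declarativeNetRequest 'block' rules.

    Assumption: each entry is a bare hostname; "||host^" anchors the
    filter to that domain and its subdomains.
    """
    rules = []
    for offset, domain in enumerate(sorted(set(domains))):
        rules.append({
            "id": start_id + offset,          # ids must be unique within a ruleset
            "priority": 1,
            "action": {"type": "block"},
            "condition": {"urlFilter": f"||{domain}^"},
        })
    return rules

rules = domains_to_dnr_rules(["ads.example.com", "tracker.example.net"])
print(json.dumps(rules, indent=2))  # contents for a static ruleset JSON file
```

Since the static file is bundled with the extension, changing it means shipping a new extension version, which is the update bottleneck being complained about here.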
[1] https://developer.chrome.com/docs/extensions/reference/api/d...
"Based on input from the extension community, we also increased the number of rulesets for declarativeNetRequest, allowing extensions to bundle up to 330,000 static rules and dynamically add a further 30,000."
I like this. or possibly the COM API. but I'm not a Windows expert.
Better to just use a browser that actually respects its users.
Google knows what will likely happen, and pays people lots of money to know.
1. Googler, opinion solely my own.
I am consistently blown away when I inadvertently experience the Internet without ad-blocking. It’s absolute garbage.
I am sad that people are either OK with this or don’t care. For many they don’t know any better, and asking many of those same groups to install and manage plugins is a fraught request.
Chrome's market share is about 65% [2]. If their recent manifest changes eventually break ad blocking (which seems to be the goal), it'll lose a bunch of market share (I guess they're optimizing for short-term profit).
[1] https://backlinko.com/ad-blockers-users
[2] https://gs.statcounter.com/browser-market-share
The day Chrome can't sufficiently block ads anymore is the day Chrome dies.
I suspect they have silently stopped blocking ad blockers.
I remember there were a lot of reports about this being the case, but there is no way I am not blocking Google.