CAPTCHAs: 'a tracking cookie farm for profit masquerading as a security service' (pcgamer.com)

198 points by ghuroo1 1 year ago | 133 comments

jp191919 1 year ago |

I'm at the point now that if I get a CAPTCHA, I'm just going to leave the site. I'll spend my money elsewhere or find an alternative

a2128 1 year ago | |

My government's websites require solving a reCAPTCHA for basic services, which is horrifying. They also use Cloudflare which blocks me sometimes. This is in the EU

phoronixrly 1 year ago | | |

Confirming this. I am also completely certain that gratuitous CAPTCHA use is banned for government systems by my country's set of laws governing their implementation. The judicial system and the community have not matured enough to consider this a breach of law worthy of fighting against...

openplatypus 1 year ago | | |

Name and shame, please!!

ReCAPTCHA, due to its lack of an opt-out, is effectively illegal in the EU.

bmacho 1 year ago | | |

Where in the EU? Maybe you can file a GDPR complaint

cyberax 1 year ago | |

This automatically means that you're penalizing smaller websites. And killing off the independent alternatives to Reddit/Disqus. Do you want this?

Large sites like Amazon or CNN can afford to eat the bot traffic. Smaller sites can't.

cryptoegorophy 1 year ago | | |

The problem isn’t bot traffic. I run an ecommerce site, and scammers run Python scripts to test 1000s of cards per hour if there is no captcha. I hate it, my customers hate it, scammers hate it, but it is the only thing that keeps my merchant account running. Any advice is welcome!

Zak 1 year ago | | |

> killing off the independent alternatives to Reddit/Disqus

I haven't encountered a captcha using Lemmy. There might be one on some servers for account creation.

KennyBlanken 1 year ago | | |

What are you on about?

I've used Amazon from the same IP address for years and I still regularly get the "you look like a bot, solve this" crap.

mouse_ 1 year ago | | |

Did you read the article? What you said directly goes against the study's conclusion.

noah_buddy 1 year ago | | |

Sounds a heck of a lot like the bots are killing off these websites. Gross overuse of automated scraping is a fact of life but individual choice is intolerable. What if I told you they were the same thing?

Dotnaught 1 year ago |

Google addressed the claims in this paper last year, and one of the authors challenged the company's responses. See: https://www.theregister.com/2024/07/24/googles_recaptchav2_l...

kevin_thibedeau 1 year ago | |

As of two weeks ago my locked down Firefox profile gets hit with captchas on every visit to Google search. DDG has also gone to shit with captchas and stupid low cache lifetime because I use their non-javascript site. I'm giving Bing a test run before making the leap to Kagi.

yegg 1 year ago | | |

We (at DuckDuckGo) shouldn’t have a lot of captchas and when we do (intended to keep away non-human traffic) they are completely anonymous, self-hosted, and not related to any AI or other machine learning. As such, I’d love to figure out why you are getting ensnared in them when you aren’t supposed to. If you want to reach out via email (see my profile) I will look into it.

EVa5I7bHFq9mnYK 1 year ago | | |

Try also Startpage, it doesn't give me any captchas even though I am a career criminal: guilty of adblocking under Firefox influence, while committing a VPN. They also have a nice Anonymous view.

vitehozonage 1 year ago | | |

You might want to try Mullvad Leta, it's what I use for this issue. I would try Kagi if it could be used privately, but I suppose it still requires an account and has no way to pay privately.

eykanal 1 year ago |

The problem with this paper is that, while technically true, there are many website owners who have found that CAPTCHAs have effectively reduced the spam on their site to zero. The fact that a CAPTCHA _can_ be bypassed doesn't mean that it _will_, and most spam bots are not using cutting-edge tech because that's expensive.

To say "it's worthless from a security perspective" is a pretty harsh and largely inaccurate representation. It's been tremendously useful to those who have used it. If it wasn't valuable, it wouldn't be so widely used.

Definitely agree with the whole "tons of free $$$ for Google", but that's kind of their business model, so yeah, Google is being Google. In other breaking news, water is still wet.

btown 1 year ago |

The "cookie farm for profit" point is worth elaborating on. From the original paper https://arxiv.org/pdf/2311.10911 :

> More concretely, the current average value life-time of a cookie is €2.52 or $2.7 [58]. Given that there have been at least 329 billion reCAPTCHAv2 sessions, which created tracking cookies, that would put the estimated value of those cookies at $888 billion dollars.

The cited paper is https://www.sciencedirect.com/science/article/pii/S016781162... - but it doesn't deal with CAPTCHAs, just with the general economics of third-party cookies.

In practice, many of these cookies will have already been placed by other Google services on the site in question, with how ubiquitous Google's ad and analytics products are. And it's unclear whether Google uses the _GRECAPTCHA cookies for purposes other than the CAPTCHA itself (in the places where this isn't regulated).

But reCAPTCHA does give Google the ability to have scripts running that fundamentally can't be ad-blocked without breaking site functionality, and it's an effective foot in the door if Google ever wanted to use it more broadly. It's absolutely something to be aware of.
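For anyone who wants to check the headline number: the paper's estimate quoted above is a single multiplication. A minimal sketch, using only the two figures from the quote (the variable names are mine):

```python
# Figures taken from the quoted passage of the paper.
sessions = 329e9             # reCAPTCHAv2 sessions that created a tracking cookie
value_per_cookie_usd = 2.7   # average lifetime value of one cookie, in USD

# Estimated total value is simply sessions times per-cookie value.
total_usd = sessions * value_per_cookie_usd
print(f"${total_usd / 1e9:.0f} billion")  # prints "$888 billion"
```

Note that the estimate assumes every session yields a cookie worth the full average value, which is exactly the assumption questioned above.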

ghuroo1 1 year ago |

That made us spend 819 million hours clicking on traffic lights to generate nearly $1 trillion for Google.

voisin 1 year ago | |

At approximately 750,000 hours in a human lifespan, they wasted about 1,100 human lives in total. Unbelievable.
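The division behind that estimate, as a quick sketch (750,000 hours is the rough lifespan figure used here, about 85 years):

```python
hours_spent = 819e6        # hours spent on reCAPTCHA, per the headline
hours_per_life = 750_000   # rough human lifespan in hours (assumption from this comment)

lives = hours_spent / hours_per_life
years_per_life = hours_per_life / (24 * 365.25)

print(f"{lives:,.0f} lifetimes of about {years_per_life:.1f} years each")  # 1,092 lifetimes
```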

thechao 1 year ago | | |

There's a dystopian short story in your comment about AI that can't self-bootstrap without ground-truth from humans, so they keep us around just to mark images, music, etc. Lives wasted annotating things. I like to think they'd drag us from solar system to solar system for this purpose.

extraduder_ire 1 year ago | |

Does solving captchas generate $1000/hour? I assume you're conflating amounts here, or messed up an order of magnitude somewhere.

rozab 1 year ago | | |

That's just the headline of the article.

The researchers put the vast majority of this value to tracking cookies, and this revenue happens whether or not a manual challenge is completed.
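For reference, dividing the two headline figures does land near $1000/hour, so the arithmetic isn't off by an order of magnitude; the number just isn't a wage for solving challenges. A quick check, using the $888 billion and 819 million hours quoted in this thread:

```python
cookie_value_usd = 888e9   # estimated lifetime value of the tracking cookies
hours_spent = 819e6        # total human hours spent on challenges

# Naive rate: total value divided by total hours. This conflates cookie
# revenue with challenge-solving time, which is the point made above.
usd_per_hour = cookie_value_usd / hours_spent
print(f"${usd_per_hour:,.0f} per hour")  # prints "$1,084 per hour"
```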

scubadude 1 year ago | |

That's just the direct value, how about the whole dimension of training the AI models

ChrisArchitect 1 year ago |

[dupe] Earlier: https://news.ycombinator.com/item?id=42997755

https://news.ycombinator.com/item?id=42970780

breppp 1 year ago |

I get that people are here to hate on Google, but I am just here to say that reCAPTCHA, albeit acquired, is an absolutely brilliant idea. The kind that solves two (three? if you count tracking) problems so elegantly.

phoronixrly 1 year ago | |

Absolutely agreed on the 'very elegant solution for global-scale tracking' part!

extraduder_ire 1 year ago | |

The people who created the initial version that got bought went on to create Duolingo, with a similar goal of getting people to produce translations of text.

therein 1 year ago | |

Multi-purpose trojan horse. Not only will it look beautiful in your city but you can use it as scaffolding to repair tall buildings or children in your community could use it as a play gym.

darkwater 1 year ago |

Naive question: how can clicking on the motorbike or traffic light image help to train an ML algorithm if they already know which image has a motorbike in it? Otherwise the captcha would not make sense. Maybe they show 3 images that already have a score of >0.90 and one that is at just 0.40?

mbb70 1 year ago | |

Yes, known images are used for validation, unknown images are used for training.

inetknght 1 year ago | |

> Naive question: how can clicking on the motorbike or traffic light image help to train an ML algorithm if they already know what image has a motorbike in it, or otherwise the captcha would not make sense.

It's more than just your answers that are fed into ML and more than just what others have already said: there's also the way that your browser functions and the way you interact with it. Your IP address, browser, OS, screen size, input type, timezone and current time of day, how fast you select different images, etc. All of this gets fed into ML algorithms, and answers to the obvious images are used as corollaries to support/deny your ancillary information.

michaelt 1 year ago | |

Hypothetically speaking, if they've got a 97% good ML model, they could implement a captcha where if you disagree with their model you have to do a second image, and a third image and so on. Then they could show each image to several different humans, and only if a bunch of people disagree with the model do they take a closer look.

Frankly a lot of the images I get are... kinda easy? This isn't the classic book-reading recaptcha where you could see why the text had confused the OCR.

hyperman1 1 year ago | |

They ask the same question to multiple people. Whatever the majority answers is right.

woleium 1 year ago | |

They ask you to solve two. One they know, the other they don’t.
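A minimal sketch of how the known/unknown pairing and the majority vote mentioned in this subthread could fit together. This is an illustration under my own assumptions, not how reCAPTCHA is actually implemented; the function names, thresholds, and sample data are all hypothetical.

```python
from collections import Counter
from typing import List, Optional

def passes_control(answer: str, known_label: str) -> bool:
    """A user is trusted only if they label the already-known image correctly."""
    return answer == known_label

def consensus_label(votes: List[str], min_votes: int = 3,
                    min_share: float = 0.7) -> Optional[str]:
    """Return the majority label for the unknown image, or None if agreement is weak."""
    if len(votes) < min_votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_share else None

# Hypothetical responses: (answer for the known image, answer for the unknown image).
responses = [("bus", "bus"), ("bus", "bus"), ("car", "truck"),
             ("bus", "bus"), ("bus", "truck")]

# Keep only votes from users who got the known image right, then take the majority.
trusted_votes = [unknown for control, unknown in responses
                 if passes_control(control, "bus")]
print(consensus_label(trusted_votes))  # prints "bus" (3 of 4 trusted votes)
```

The known image gates the user; the unknown image only gets a label once enough trusted users agree on it.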

DougN7 1 year ago | | |

I’m not sure. If I don’t click on one that is a bus it won’t let me forward. It’s not like I click an “Ok, I’m done” button. I guess we could all delay clicking and maybe it would give up and assume the unknown bus wasn’t really a bus after all?

pupppet 1 year ago |

What's the alternative?

unethical_ban 1 year ago |

What proof of humanity is sufficient? Today it is a phone call, or a verification sent to a real address (limit one registration per household), or a video call. How will we verify humanity in 20 years when audio and video emulation is foolproof?

We'll have to have in-person attestation or make all services paid, perhaps.

phoronixrly 1 year ago | |

I would wager all services will be linked to a verified credit or debit (non-temporary) card. Most of them are now...

How are you going to connect the physical person with an identity with in-person attestation? Many countries (several of them major English-speaking ones) don't have mandatory government IDs...

A commenter below suggests that government eIDs could be used. I bet this will be harder to implement and will have much worse conversion rates than (the already terrible) mandatory credit/debit cards... Not to mention the hell that we as non-US citizens will have to endure if anyone tries to impose any form of mandatory ID there... One can only take so much complaining about government overreach about something that is a basic necessity here in the EU...

thatguy0900 1 year ago | |

Realistically it will be a government or private service that everyone will have to have to verify that it is a real person. Or at least tied to a real person so that banning will be more sticky.

apitman 1 year ago | | |

Like Google

nervysnail 1 year ago | |

Nothing less than drinking a "verification can", as presciently propounded by a 4chan user a decade ago.

kykeonaut 1 year ago |

Wouldn't some sort of proof of work be a good solution to the captcha problem?

Especially since, all of a sudden, a bot service running hundreds of thousands of requests will inadvertently have to compute cryptographic hashes at the cost of the user running the bots?
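To make the idea concrete, here is a minimal hashcash-style sketch, assuming SHA-256 and a server-issued challenge string. The function names, the challenge value, and the difficulty are placeholders of mine, not any particular service's scheme.

```python
import hashlib
import itertools

def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: search for a nonce whose hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

challenge = "example-server-challenge"         # hypothetical per-request value
nonce = solve(challenge, difficulty_bits=16)   # about 65,000 hashes on average
print(verify(challenge, nonce, 16))            # prints True
```

Each extra difficulty bit doubles the expected client work while verification stays at one hash, so the cost lands on whoever sends the most requests.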

theamk 1 year ago | |

No, because a lot of bot services run on botnets made out of hacked regular residential computers, routers and so on. They will feel a bit more sluggish, but it won't cost the botnet authors that much more.

On the other side, an amount of work reasonable for a modern desktop will absolutely overwhelm an older cell phone.

jvdvegt 1 year ago |

To prevent the cookie wall with no 'reject all': https://archive.is/oHc1e

nonrandomstring 1 year ago |

You can get people to do almost anything if you lie to them that it's for "security".

nonrandomstring 1 year ago | |

Literally. Social engineering 101. Grab a clipboard, put on a hi-viz, speak in an authoritative, directive voice and people will absolutely do what you ask at the sound of the word "security". Social engineering defence 101: teach scepticism and not being intimidated by the word "security"... ask "whose security?", "security from what?", "security to what end?", and "show me your ID and the written policy".

catlikesshrimp 1 year ago | |

Except captcha is not supposed to be security for the user, but security for the website.

But in the end it is not (effective) security for a website, it is an antifeature for users, and it is profit for Google.

jisnsm 1 year ago | | |

As a website developer and host, I can assure you recaptcha works very well to stop spam and automated login requests. It is not perfect, but no system is.

jdietrich 1 year ago |

loloquwowndueo 1 year ago | |

It’s not three years, it’s thirteen.

> A lifetime value of $888 billion for all of reCAPTCHAv2's tracking cookies produced between 2010 and 2023.

phoronixrly 1 year ago | | |

jdietrich, I feel your pain, I am also completely convinced that 2010 was 3 years ago :(

bigbuppo 1 year ago |

819 million hours of unpaid labor. And just think, a large chunk of that was performed by children. CAPTCHAs are slave labor in small doses. It's also a way of avoiding paying taxes on that labor. But hey, what's a few billion dollars in unpaid taxes and unpaid wages and child labor violations between friends?