Moving from reCAPTCHA to hCaptcha

Moving from reCAPTCHA to hCaptcha (blog.cloudflare.com)

438 points by migueldemoura 6 years ago | 189 comments

zachware 6 years ago |

One of the more insidious elements of ReCAPTCHA is its propensity to challenge users who have robust cookie blocking in place. So as we encourage people to be more privacy-aware, the web gets harder and harder to use.

We've seen ReCAPTCHA pop up all over ecommerce and all over benign websites with little to no need to challenge users, almost entirely because of the increase in privacy-aware users.

ReCAPTCHA essentially flies in the face of the recent blocking features rolling into Safari and Firefox and more privacy-aware users...growing by the day.

In many ways it's a genius structure from Google. 1. Convince people to use your privacy challenge. 2. Serve it when you don't see Google tracking cookies. 3. Offer a way around that with the least privacy-aware browser available (Chrome use is growing steadily month over month).

So good on Cloudflare.

jjoonathan 6 years ago | |

That, and ReCAPTCHA had hellbans.

If you blocked cookies or were otherwise problematic, it would sometimes lock you out of all ReCAPTCHA-gated resources not by giving you a message describing what was happening, why, and how to fix it, but rather by simply pretending that your every attempt to solve the captcha failed. Obviously this is extremely frustrating, by design, but it gets even more so with compounding factors like "the library is closed at this hour, so I can't get a fresh connection."

The worst I've seen has been when it happens to people who aren't well equipped to guess what's happening. When my friend's younger brother got hellbanned from his PlayStation account, he spent 30 minutes trying to identify traffic lights (or whatever) and then retreated crying to his room, because he wasn't able to deduce that Google was gaslighting him. He trusted Google. They had him convinced that he was such a failure he couldn't even identify traffic lights correctly, and he was -- quite reasonably -- inconsolable for a while.

Thanks a lot, Google.

RadiantUnicorn 6 years ago | | |

I don't think I've ever been "hellbanned", but I've certainly spent more than 5 minutes on trying to get a captcha to work.

After a while I usually need to ask friends in the US to help me, because it asks me a non-localized question.

My favourite question was: Select all fire hydrants.

I selected only the classic red ones you see in movies. Fail.

I selected the ones that were yellow too. Fail.

I sent a picture of the grid to a friend. He spotted that some of the pipes on a wall were fire hydrants, which I didn't know. Pass.

In my country we don't have hydrants. We have holes in the ground that are covered by a lid. After removing it you can attach the water hose there.

_jal 6 years ago | | |

Captchas are fundamentally anti-human. I'm not saying there isn't a problem to be solved, I'm saying Captchas are a behavior enforcement mechanism overseen by robots and are anti-human.

When they go bad, I write the site owner a short note explaining why they just lost a customer, and go somewhere else. Life is too short to put up with shitty tech.

gentleman11 6 years ago | | |

I must have been hellbanned in the past. It used to take 30 minutes to log into Humble Bundle because of the endless stoplights and sidewalks. I buy a lot fewer bundles now since I’m still a little bitter.

Now I just deliberately give bad answers and get to “pass” the challenges... not sure why

Kalium 6 years ago | | |

How, in your opinion, should Google have handled the matter in a way that does not give spammers or other abusive users ways to get around the measure? Bear in mind that any such approach has to be scalable to many zeros daily, the vast majority of which will not be empathically awful cases like your brother's very real pain and distress - most will be genuinely abusive behavior.

I want to be clear that I am not attempting to minimize your brother's pain or emotional suffering. I'm hoping that there might be an approach that's kinder and more compassionate to him while still accomplishing the same goals.

noad 6 years ago | |

You're forgetting the main benefit for google, which is getting humans to train all their vision models for free. At one point they were just forcing X% of clicks to fill out a captcha regardless of origin or identity just to get more data.

I for one am getting quite tired of trillion dollar corporations getting things for free out of me. Hard pass.

weinzierl 6 years ago | | |

> You're forgetting the main benefit for google, which is getting humans to train all their vision models for free.

Is this still true? I keep seeing the same type of images for years and there might be 7 or 8 different categories but that's it. To me reCaptcha looks like a service well in its maintenance phase. If it was actually in use for training purposes you might expect images to match a wider range of tasks.

grishka 6 years ago | | |

Except in this wonderful new world, you don't get the choice to "hard pass". As someone whose ISP has too few public IP addresses, I see Cloudflare's "one more step" pages at least several times a month. It's terrifying to realize just how much of the internet is behind that thing right now.

Analemma_ 6 years ago | | |

This really shows how popular perceptions of Google have changed for the worse over the years. I remember when RECAPTCHA was first launched, everyone knew right away that it was just helping Google train their vision models, but at the time we all thought it was cool, like "Wow, I'm helping the cause of AI research at the same time as stopping spam". But now it just pisses everyone off.

Hell, for a little while Google had a game (can't remember the name of it) which was labeling images with another person to get points and people loved it.

mikkelam 6 years ago | | |

I really don't think the challenges we're given are still hard for computers... a lot of these are super simple... Google would've cracked many of the driving ones years ago.

derefr 6 years ago | | |

If that was still the main benefit for them, they wouldn’t be planning to start charging for it, because that would—and, as this article shows, has—cut off much of that data flow, as reCAPTCHA clients abandon the service for another one that isn’t charging them.

0xff00ffee 6 years ago | | |

Did you even RTFA and look at hCAPTCHA? hCAPTCHA couldn't be more grossly focused on neural-net training. Hell, one challenge asks you to draw a bounding box and another is a classification tagging.

Kalium 6 years ago | |

One of the non-obvious consequences is that any system designed to use technical measures to distinguish between humans and computers will wind up very sensitive. There's an arms race, and us real users are caught in the middle.

There's a vast army of computers doing their best to pretend to be human. The whole point of any kind of CAPTCHA is to try to catch them out - and every measure gets worse over time. So companies like Google look at everything they can see that helps them distinguish typical humans from robots.

This has a nasty side-effect. A lot of measures intended to preserve privacy have the incidental effect of making the privacy-sensitive user look more like a computer and less like a human. Not saving cookies and not executing JS are classic bot moves. This plays directly into the sensitivity that has been engineered over time in order to catch more computers posing as humans.

I don't know any easy resolution to this tension. Maybe you do? I really hope so. The internet is overrun with abusive behavior and the amount of work that goes into keeping it at bay is staggering.

GuB-42 6 years ago | |

> One of the more insidious elements of ReCAPTCHA is its propensity to challenge users who have robust cookie blocking in place.

It is understandable and I expect HCAPTCHA to do the same thing. The goal of a CAPTCHA is to identify you as a human. I don't know how ReCAPTCHA works, but I expect it to be like spam filters: they have a sample of bots, a sample of humans and assign weights to every aspect, in the end, the algorithm spits out a probability of you being human, and it will challenge you until it reaches a set value.
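To make the spam-filter analogy concrete, here is a minimal sketch of that kind of scoring. It is not how reCAPTCHA or hCaptcha actually works (the comment itself says that is unknown); the signal names, weights, bias, and threshold are all invented for illustration, and the weighted sum is squashed into a probability with a logistic function.

```python
import math
from typing import Optional

# Hypothetical signals and weights, invented purely for illustration.
WEIGHTS = {
    "has_long_lived_cookie": 2.0,
    "executes_javascript": 1.5,
    "ip_seen_in_abuse_reports": -3.0,
    "passed_challenges": 2.5,
}
BIAS = -1.0       # assumed prior: an unknown client starts out looking bot-like
THRESHOLD = 0.9   # assumed "set value" the score must reach


def human_probability(signals: dict) -> float:
    """Combine weighted signals into a probability with a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in signals.items())
    return 1.0 / (1.0 + math.exp(-z))


def challenges_needed(signals: dict, max_challenges: int = 5) -> Optional[int]:
    """Return how many solved challenges lift the score to THRESHOLD, if any."""
    trial = dict(signals)
    for solved in range(max_challenges + 1):
        trial["passed_challenges"] = float(solved)
        if human_probability(trial) >= THRESHOLD:
            return solved
    return None  # never reached the threshold within the allowed attempts


regular = {"has_long_lived_cookie": 1.0, "executes_javascript": 1.0}
private = {"has_long_lived_cookie": 0.0, "executes_javascript": 0.0}
```

With these made-up numbers, the `regular` visitor clears the threshold without any challenge, while the `private` visitor, who exposes no positive signals, has to solve challenges until the score catches up.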

The thing is: if you hide everything for privacy reasons, you are making yourself indistinguishable from anything else using HTTP, including bots. That's the point, but it also means the only way to prove you are human is through a challenge.

Think of it like a private club. If you are a regular, the bouncer is likely to recognize you and let you in without asking anything. But if you don't want to show your face, you will need to show your membership card every single time. That's the price of anonymity.

andrenotgiant 6 years ago | |

> One of the more insidious elements of ReCAPTCHA is its propensity to challenge users who have robust cookie blocking in place... ...So good on Cloudflare.

Just to be clear: Cloudflare is only changing the _provider_ of CAPTCHAs. They are not changing the _criteria_ for showing CAPTCHAs.

So users who have robust cookie blocking in place will continue to be penalized.

tcd 6 years ago | |

I would love to see the raw data on how many transactions have been abandoned because of ReCaptcha; if I had to solve a test to purchase my shopping, I'd go elsewhere (and there are places that are not as hostile out there).

I cannot understand the stupidity of putting your entire business in the hands of an advertisement company who gives no shits about you as a business or a person, apart from your data.

I can say for certain ReCaptcha has made me reconsider a purchase and is a major factor in my purchasing decision. If I can't use all my privacy tools (including noscript, and I only whitelist a few times to get the right scripts), then I don't care about what you're selling.

Hopefully in the near future ReCaptcha breaks altogether due to enhanced privacy protection.

gsich 6 years ago | |

I use buster to solve recaptcha.

blakesterz 6 years ago |

> "Earlier this year, Google informed us that they were going to begin charging for reCAPTCHA. That is entirely within their right. Cloudflare, given our volume, no doubt imposed significant costs on the reCAPTCHA service, even for Google."

Even in the article they say... "Google provided reCAPTCHA for free in exchange for data from the service being used to train its visual identification systems." ... I thought this was one of those win/win things... Google gets something, websites get something... what's changed? Is Google not getting much out of reCAPTCHA now?

yjftsjthsd-h 6 years ago |

Well. That's probably fantastic news; using ReCAPTCHA (and thereby making users subject to Google's tender mercies) was honestly my main reason to dislike cloudflare from a user's perspective. ReCAPTCHA is utterly foul; it follows you everywhere it can, exists to undermine privacy, punishes non-Chrome users, and throws you in an infinite loop when it decides that you're not a human.

curiousgal 6 years ago | |

I don't blame reCAPTCHA for existing, I blame Cloudflare for using it. It made using Tor literally impossible. Hopefully this will be better.

lol768 6 years ago | | |

Didn't Privacy Pass help here?

elric 6 years ago |

It's a start. reCAPTCHA is a notorious pain in the arse for anyone whose browser isn't Chrome and for anyone who doesn't keep cookies. I'm not sure if hCaptcha will be better, but it's hard to imagine it being any worse.

cinbun8 6 years ago |

> Earlier this year, Google informed us that they were going to begin charging for reCAPTCHA

So it came down to cost.

> Over the years, the privacy and blocking concerns were enough to cause us to think about switching from reCAPTCHA. But, like most technology companies, it was difficult to prioritize removing something that was largely working instead of brand new features and functionality for our customers.

I like that they're upfront about this. In most companies / teams of this size, these issues are always swept under the carpet until something ugly forces you to clean up at a later point in time. It's just unavoidable.

alexnewman 6 years ago |

Hey everyone. HCaptcha founder here. We are so happy to be on hackernews. I'm curious if anyone is having any problems? We are trying hard to respond carefully to customer requests but as you can guess we are very busy. Also we are hiring :)

ronyfadel 6 years ago | |

Hey Alex, one suggestion: the HCaptcha challenge box is way too tall. Sitting at 725px, it's larger than the Chrome viewport on a 13" MBP, so I have to keep scrolling up and down to solve the captcha.

alexnewman 6 years ago | | |

You would not believe how much we think about these things. We appreciate the feedback and will continue to tune for every puzzle. Thank you so much for the feedback.

aeonflux 6 years ago |

This is what I recently got on CF's HCAPTCHA (look closely): https://imgur.com/a/QZNHmUC

alberth 6 years ago | |

I see 2 clear images of dogs. 2 possible dog images. And zebras humping.

Nice.

chrismorgan 6 years ago |

A few days ago I encountered this when Cloudflare decided my IP address (which is behind an ISP-level NAT) was suspicious all of a sudden (which it hadn’t been doing, a pleasant change from when I was at this location three years ago when half the internet sprouted Cloudflare CAPTCHAs at me). It was awful to solve, worse than the substantial majority of reCAPTCHA checks I’ve encountered. Certainly nothing like the illustrations in the article.

IAmEveryone 6 years ago | |

I had the same experience. But this may just be an artefact of humanity now having been trained exceptionally well to identify traffic lights and busses, but being relative novices at identifying elephants.

And now I'm wondering if this may not be a spectacularly useful tool to raise standards of education world-wide. Imagine, say, the French government buying them and asking every person on the internet twice a day to match some vocabulary to images: Identify "le baguette"! Lingua Franca, le sequel.

Or a maps puzzle: "Please identify Equatorial Guinea, Papua New Guinea, and Guinea-Bissau".

hbvvvvgff 6 years ago | |

I tried an hcaptcha and it was way harder to solve than the usual recaptcha. However, it was significantly easier than the recaptchas you get when using tor.

KCUOJJQJ 6 years ago |

I just tried it on a website that uses Cloudflare and that always asks me to solve a captcha. (I guess this website does this if the user has a foreign IP address.) In the past I managed to get the non-script Recaptcha. But I don't see a non-script Hcaptcha. I'm a little afraid of possible browser fingerprinting scripts. If there was an unwaivable, enforced right to privacy I wouldn't be afraid.

Also, I don't want to solve any script captchas anymore because of a traumatic experience with script Recaptcha. I had a portable Chromium with login cookies for a few websites. I didn't use that Chromium for other websites than these few. Suddenly, one service almost always demanded a new login after just 1 day. On each login I had to solve a script Recaptcha. I didn't find a way to get non-script Recaptcha. According to the service evil spambots had attacked it. Once, Recaptcha let me solve captchas for minutes, just to eventually tell me I was a bot. I had an IP of a large internet provider. I deleted cookies, got a VPN IP, tried it again, worked on the captchas in the exact same way as before and managed to log in to my account. A website operator wrote in a forum thread that Recaptcha was the only solution to the bot problem. One user suggested "email login as an optional alternative". This was not implemented, because apparently Recaptcha was really specifically the only solution. I then switched to another service, which cost me a few hours of work. This traumatic experience has made me completely unwilling to solve any script captcha.

worble 6 years ago |

A little off-topic, but the article mentions they support Privacy Pass. I remember seeing the announcement a little ways back when they first released it but just kind of forgot about it. Is anyone using the browser extensions? Has it reduced the amount of captchas you end up seeing, or made your browsing experience better in any way?

devy 6 years ago |

The enterprise grade hCaptcha[1] is not free either. Does anyone have pricing information?

[1]: https://www.hcaptcha.com/#plans

wongarsu 6 years ago | |

According to the article Cloudfront is paying, but is paying "a fraction of what reCAPTCHA would have [cost]". Recaptcha is $1/1000 challenges, so apparently hcaptcha is some small fraction of that.

Cloudfront might get a discount for running some of the infrastructure on their own servers, on the other hand that might also be an integration hassle that actually costs them money.

meowface 6 years ago | | |

> Recaptcha is $1/1000 challenges

This seems unwise, because many captcha farms charge less than this. A quick Google search shows one service offering $0.50/1000 challenges. If it's 2x cheaper for an attacker to solve a captcha than it is for a provider to display it, it sounds like the attackers win.

JakeTheAndroid 6 years ago | | |

*Cloudflare

Macha 6 years ago | |

It sounds like Cloudflare is paying at least partially in free/discounted Cloudflare services.

StavrosK 6 years ago | |

It says it's free for non-enterprises.

yjftsjthsd-h 6 years ago | | |

Okay, but Cloudflare is very much an enterprise, and lots of people here are working in such places, so it's a decent point.

cm2187 6 years ago |

> But, sometimes, when we're not 100% sure if something is malicious or good we issue it a “challenge”.

I think they meant “bot or human”, not “malicious or good”. Bot != malicious. And these challenges will do no good to non malicious bots.

lucideer 6 years ago | |

I think you're confusing intent with implementation.

You're right that the implementation excludes non-malicious bots and fails to solve for malicious humans, but that just makes it an imperfect implementation of the intent: which is to differentiate malicious & good.

_nickwhite 6 years ago |

From the article:

"We evaluated a number of CAPTCHA vendors as well as building a system ourselves."

and

"We worked with hCAPTCHA in two ways. First, we are in the process of leveraging our Workers platform to bear much of the technical load of the CAPTCHAs and, in doing so, reduce their costs. And, second, we proposed that rather than them paying us we pay them. This ensured they had the resources to scale their service to meet our needs. While that has imposed some additional costs, those costs were a fraction of what reCAPTCHA would have. And, in exchange, we have a much more flexible CAPTCHA platform and a much more responsive team."

So Cloudflare are basically cloud hosting hCAPTCHA's services. I wonder why Cloudflare didn't just buy them, as it seems like it would be a win-win with getting an excellent CAPTCHA service, and not have to build it themselves?

IAmEveryone 6 years ago | |

CF likes the CAPTCHA part of CAPTCHAS, but any vendor is probably far more invested in the "generating ML training data" scheme.

CF probably has zero interest in that part of the product: It doesn't fit with their existing products nor customers, and it's just too small relative to their other business to devote much attention to it.

At the same time, the business opportunity is probably too large for hCAPTCHA's founders to just forget about it, or for CF to compensate them on the hot-new-technology assumption when they're only looking for peace-of-mind-utility tech.

dathinab 6 years ago | |

At the end they mention that their long-term goal is to eliminate captchas fully if possible.

beojan 6 years ago | |

I suspect that might happen eventually.

noncoml 6 years ago |

IMHO CAPTCHA is a lazy way to protect your service as you shift the burden to your users.

Maybe if you are big and essential for some users, you can afford that. But if not, be aware that users will turn their back on you if you add obstacles between them and your service.

Edit: meant to say “be aware that some users will turn their back to you”

Legogris 6 years ago |

Apart from the surveillance aspect, one thing that bothered the hell out of me with Cloudflare using ReCAPTCHA was that it left a much larger part of the web than necessary effectively blocked in China, since the CAPTCHAs would get triggered, and not load, from Chinese IPs.

I had a customer where we had to migrate away from Cloudflare for this reason - this was about 5 years ago and the issue has been there to this day. Glad to hear they've finally done something about it. Even if it took Google starting to charge money for ReCAPTCHA to trigger it.

jasonhansel 6 years ago |

Has anyone else seen reCAPTCHA getting way more difficult of late? It often takes me a full minute to find all of the tiny traffic lights hidden away in a set of low-quality images.

ship_it 6 years ago | |

Just use Buster[1]

[1] https://chrome.google.com/webstore/detail/buster-captcha-sol...

drusepth 6 years ago | | |

Worth noting that it's possible to get a hellban if you get too many wrong guesses using extensions like Buster.

notechback 6 years ago |

This sticks out to me:

> We also had issues in some regions, such as China, where Google's services are intermittently blocked. China alone accounts for 25 percent of all Internet users. Given that some subset of those could not access Cloudflare's customers if they triggered a CAPTCHA was always concerning to us.

They are explicitly saying that China's blackmailing of Google is working so well it even affects decisions on using Google products outside of China.

I'm not a Google fan and think this move is a great improvement for the web and user privacy, but that this was explicitly motivated by China's blackmailing tactics is terrifying.

And from this post we can even make another case that also doesn't paint a nice picture: Cloudflare did not care enough about 25% of internet users to move away from reCAPTCHA - until it affected their bottom line in a visible and immediate way.

kevindong 6 years ago |

There are plenty of services that will happily accept a screenshot from a developer, send it out to live humans who solve it in real time, and then return the answers to the developer.

I'm not going to link to them, but you can find them yourself by googling "buy recaptcha solver". The prices for the top two results are $0.50 and $1.39 per 1000 solves (respectively, $0.0005 and $0.00139 per solve).

At that price point, it's feasible for the truly determined to just use those solvers to bypass ReCAPTCHA (or similar services).
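The per-solve figures above follow from dividing the quoted price by 1000. A small sketch of that arithmetic, using the two prices from this comment; the one-million-attempt campaign size is a hypothetical number chosen only to show the scale.

```python
def cost_per_solve(price_per_thousand: float) -> float:
    """Convert a price quoted per 1000 solves into a price per single solve."""
    return price_per_thousand / 1000


def campaign_cost(price_per_thousand: float, attempts: int) -> float:
    """Total solver cost for a given number of captcha attempts."""
    return cost_per_solve(price_per_thousand) * attempts


# Prices quoted in the comment, in dollars per 1000 solves.
QUOTED_PRICES = (0.50, 1.39)

# Hypothetical campaign size, for illustration only.
ATTEMPTS = 1_000_000

for price in QUOTED_PRICES:
    print(f"${price:.2f}/1000 -> ${cost_per_solve(price):.5f} per solve, "
          f"${campaign_cost(price, ATTEMPTS):,.2f} for {ATTEMPTS:,} attempts")
```

At these rates, a million bypassed challenges would cost somewhere between a few hundred and a bit over a thousand dollars.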

xur17 6 years ago | |

Are there chrome extensions that I can use these with? I'd be willing to pay those rates to never have to solve a captcha again. I'm fine leaving the tab open for a few minutes while it's solved even.

alexnewman 6 years ago | |

HCaptcha's enterprise solution is designed to detect this threat and others.

kennydude 6 years ago |

hCAPTCHA looks interesting, although it seems they use blockchain for no real reason compared to just storing the payments as rows (i.e., what do they gain from the records being chained on top of one another?)

colejohnson66 6 years ago | |

The point of a blockchain is that to edit an earlier record, you would need to edit every record that comes after (due to storing a hash of the previous block in the current block). However, it doesn’t make sense when one entity controls the entire system because if a hacker (or even an insider) can change one record, they could change all of them. Hence why a good blockchain would be distributed. Then, if one node edits the history, the other nodes will see the anomaly and ignore that node.

This is also why Git’s history is easy to edit when it’s only on your machine. But once you push to GitHub and others clone your repo, it becomes a lot harder to edit history. Yes, Git isn’t a blockchain, but it does use the idea of hashing the previous “block” (commit) and storing it in the current “block.”
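A minimal sketch of the hash-chaining idea described here, assuming SHA-256 and JSON-encoded payment rows. The field names and the `GENESIS` value are made up for illustration and say nothing about what hCaptcha actually stores.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(prev_hash: str, payload: dict) -> str:
    """Hash a block's payload together with the previous block's hash."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Link payloads so that each block stores the hash of its predecessor."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain


def first_invalid_index(chain: list):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS
    for i, block in enumerate(chain):
        expected = block_hash(prev, block["payload"])
        if block["prev"] != prev or block["hash"] != expected:
            return i
        prev = block["hash"]
    return None
```

Editing one row in place makes verification fail at that row. Whoever controls the whole chain can simply rebuild every later block, though, and the rebuilt chain verifies again; only someone holding an older copy of the final hash would notice the difference.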

wongarsu 6 years ago | | |

However you can do a local blockchain (or hash chain, or whatever you want to call it) and distribute just the hashes. If you have a local git repo and regularly tell me your commit IDs I can testify that the code existed at that point in time, and can later verify it wasn't changed if you choose to expose the full commit to me. And because it's a chain, you only need to communicate one commit ID for every external timestamp you care about, not for every commit you care about.
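This "distribute just the hashes" arrangement can be sketched as follows. The `Witness` class and `chain_head` helper are hypothetical names: the owner keeps the records private and hands an outside party only the current head hash from time to time, and the witness can later check any records the owner chooses to reveal.

```python
import hashlib


def chain_head(records: list) -> str:
    """Fold records into a single head hash; each step covers the previous head."""
    head = ""
    for record in records:
        material = head.encode("utf-8") + b"\n" + record.encode("utf-8")
        head = hashlib.sha256(material).hexdigest()
    return head


class Witness:
    """Outside party that stores only head hashes, never the records themselves."""

    def __init__(self) -> None:
        self.checkpoints = {}  # number of records covered -> head hash received

    def record(self, count: int, head: str) -> None:
        self.checkpoints[count] = head

    def verify(self, revealed_records: list) -> bool:
        """Check revealed records against the checkpoint taken at that length."""
        expected = self.checkpoints.get(len(revealed_records))
        return expected is not None and expected == chain_head(revealed_records)


# Usage: the owner publishes one head hash covering three private records.
records = ["pay alice 5", "pay bob 7", "pay carol 2"]
witness = Witness()
witness.record(len(records), chain_head(records))
```

One checkpoint covers every record before it, which is the point made above: a single hash per external timestamp, not one per record.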

speedgoose 6 years ago | | |

Yes, if you do not want to distribute your data to random people over the internet, you need a Merkle tree, not a stupid blockchain with all the downsides a blockchain has.

kennydude 6 years ago | | |

Yup, my point is that they control the entire thing. Although it could be like the joke where "AI" ends up being just a bunch of "if statements".

splintercell 6 years ago | |

It seems like they bootstrapped themselves using an ICO.

datafix 6 years ago |

Hey, I interviewed with them a year ago. Their captchas are actually harder than reCaptcha's.

shp0ngle 6 years ago |

> Earlier this year, Google informed us that they were going to begin charging for reCAPTCHA.

Wait. Is this news? I don’t see any other article about this. What is the pricing?

synsynack 6 years ago |

It's not worth a rich person's time to solve captchas, while it is for a poor person. This has led to captcha-solving services, extensions, plugins, etc., all of which have high latency and none of which offer a fast, documented API. It would be 100 times easier if Cloudflare/Google let you directly buy credits at the midpoint of the current bid-ask spread, say 50 cents per 1000 captchas, which would probably last you a few months to a year.

jccalhoun 6 years ago |

I've run into hCaptcha a couple times recently and found it vague, and I had to try to guess what they meant. Both times it asked me to identify the truck. Well, what do you mean by "truck?" Are you counting a semi as a truck? I ended up having to do it twice because I don't consider a semi a "truck" but they did.

Keverw 6 years ago | |

Interesting, I know some people consider a Truck a semi but your pick up truck isn't really a truck according to others. So confusing with all the different definitions.

aaron695 6 years ago |

So no one can turn free human labour into enough money to pay hosting fees?

And given spammers a lot of the time are messing with Google, it's also in Google's interest to do this for free!

What are they thinking? Is this one department making $100 internally while killing $1000 in another internal department?

TechBro8615 6 years ago |

This is fantastic news for privacy on the web. Thank you Cloudflare!

I’ve been seeing hcaptcha in more and more places recently. It’s a bit rough around the edges still, but it works well and feels far less hostile than recaptcha.

paulie_a 6 years ago |

The funny thing is that Google doesn't even use recaptcha and instead uses some awkward, hard-to-read piece of shit. After 4-5 guesses, and they are guesses, you might proceed.

rstupek 6 years ago |

Did anyone notice that hcaptcha runs on top of Ethereum?

spsrich2 6 years ago |

I hate Hcaptcha. It keeps presenting the same challenge over and over again. Every time I need to access a site it protects, it wastes so much time.

realtalk_sp 6 years ago |

Are people here not aware of reCAPTCHA v3? It doesn't involve user interaction. I just integrated it into a site. Works well.

foob4r 6 years ago |

How good or bad is the new system on tor? ReCAPTCHA straight up was a 0/10 over tor for me.

outloudvi 6 years ago |

1. I think challenges from hCAPTCHA are harder than reCAPTCHA's. They are even further from human-friendly than reCAPTCHA's.

2. hCAPTCHA seems to be using a similar revenue model to early-stage reCAPTCHA, and it even pays its users. I doubt that its model is sustainable.

3. A huge company like Google may not be able to handle user data well, so a small company will be able to?

garaetjjte 6 years ago |

Can we get back text based captchas instead of annoying whack-a-mole photo picking?

theandrewbailey 6 years ago | |

No. Photos of street things are much easier to pick out than warped or miscolored text.

hombre_fatal 6 years ago | | |

Especially the amount of warping you apparently need to do to text to make it hard for a neural network these days.

maallooc 6 years ago |

I hope this captcha is tor friendly.

blackdogie 6 years ago |

One look at what cookie domain (google.com) recaptcha runs on will give you a hint to its usefulness.

tcd 6 years ago |

It's funny that we need to ensure humans are the ones performing certain actions like making a purchase or accessing a service, but we let machines make decisions over very important matters in our lives (credit/financial decisions).

It's intriguing they said Google will charge for reCaptcha, any information on that? I can't imagine all the small business owners will have to start paying, but perhaps if they did they'd just remove it altogether (a net win!).