hahaha. What a coincidence, Google!
So you got a hold of the neural hashes, and then used an error function and descent to generate images that match a 'hash'?
It feels wrong to call them 'hashes' when they're so weak to pre-image attacks. They're not the same idea as cryptographic hashes at all.
Also want to underline how spooky it is that some of them do resemble human forms.
First is veg&butthole, then boobs, next is doggy style etc etc (edit: it seems the order isn't consistent. So I'm likely seeing different images than you.)
You can go through them all and see the original pornography if you look at the shapes. To me, it looks more like they started with the real images and tweaked them to make them artsy.
These images could be a joke, as I don't think we have clear technical documentation of how these hashes are generated. Computer vision? Vectors? Face recognition software? It's definitely not a naive hash.
Edit: seeing the other comments in this thread referencing Twitter, it looks like it's more naive than expected, as the hash is resistant to resizing, but not to cropping. The implementation can change at Apple's discretion, though.
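For what it's worth, the resize-versus-crop behaviour is what you would expect from most perceptual hashes. Here is a minimal sketch with a toy average hash, which is an assumption for illustration and not NeuralHash. The helpers `downsample`, `average_hash`, `upscale` and `crop` are hypothetical names, and the 8x8 test image is made up.

```python
def downsample(img, size):
    """Box-average a square grayscale image (list of rows) down to size x size."""
    block = len(img) // size
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            cells = [img[r * block + i][c * block + j]
                     for i in range(block) for j in range(block)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out

def average_hash(img, size=4):
    """Toy perceptual hash: shrink to size x size, then one bit per cell (above the mean or not)."""
    flat = [v for row in downsample(img, size) for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def upscale(img, factor):
    """Nearest-neighbour upscaling: repeat every pixel factor times in each direction."""
    return [[v for v in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def crop(img, top, left, size):
    """Cut a size x size window out of the image."""
    return [row[left:left + size] for row in img[top:top + size]]

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

# Made-up 8x8 image: a bright square in the top-left corner on a dark background.
image = [[200 if r < 4 and c < 4 else 20 for c in range(8)] for r in range(8)]

h_original = average_hash(image)
h_resized = average_hash(upscale(image, 2))     # same picture at 16x16
h_cropped = average_hash(crop(image, 1, 1, 4))  # off-centre 4x4 crop

resize_distance = hamming(h_original, h_resized)
crop_distance = hamming(h_original, h_cropped)
```

In this toy, resizing leaves the hash untouched because the downsampling step averages it away, while cropping changes which content lands in which cell, so the distance is nonzero. Whether a real system behaves the same way depends on its implementation.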
There is a reason cryptographic hashes are distinguished; some applications of hashing are only concerned with minimizing non-malicious collisions.
(Arguably, this is an application where malicious collisions are an issue, but perceptual hashes don't purport to be cryptographic.)
Long before they claimed that they scan email for child porn, they made an email scanner to appease China, on a condition they will not target dissidents.
I think we all remember how it went. Seeing their intimidation work only fired up the Chinese government and led them to increase their attempts at arm twisting, until Google clumsily pretended to "be tough" while still making a last attempt at behind-the-scenes negotiations, which, to their big surprise, got them banned overnight.
Even though Google could have made a ton more money by helping China to build the tools of repression.
Don't forget that other large US corporations like Microsoft, Apple and Activision do build censorship tools and participate in repressing dissent.
Not sure exactly how you'd go about doing it, but it seems like there might be a process for 'evening out' areas into solid color that maintains the hash? In which case you're running extensive image processing on illegal images and making variations from those very images.
More info on how this is done?
I would assume these were engineered by getting the perceptual hash values, using distance from the hash values in the DB as an error function, and starting with an innocuous image and hash value, and iterating to a collision for each.
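To make that guess concrete, here is a rough sketch of iterating an innocuous image toward a target hash. Everything in it is an assumption: the hash is a toy average hash rather than NeuralHash, the search is a greedy pixel nudge rather than gradient descent, and `forge_collision`, `target_bits` and `innocuous` are hypothetical. The point it illustrates is that the loop only needs the target hash bits, never the image they came from.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is above the mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming(a, b):
    """Error function: number of hash bits that differ."""
    return sum(x != y for x, y in zip(a, b))

def forge_collision(start, target_bits, step=4, max_iters=10_000):
    """Nudge mismatched pixels until the toy hash equals target_bits."""
    pixels = list(start)
    for iteration in range(max_iters):
        current = average_hash(pixels)
        wrong = [i for i, (c, t) in enumerate(zip(current, target_bits)) if c != t]
        if not wrong:
            return pixels, iteration
        i = wrong[0]
        # Push the pixel up if its bit should be 1, down if it should be 0.
        delta = step if target_bits[i] == 1 else -step
        pixels[i] = min(255, max(0, pixels[i] + delta))
    raise RuntimeError("no collision found within max_iters")

# Hypothetical target hash; the attacker never sees the image behind it.
target_bits = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1]
# Innocuous starting point: a nearly flat grey 4x4 image, flattened.
innocuous = [120 + (i % 4) for i in range(16)]

forged, iterations = forge_collision(innocuous, target_bits)
final_error = hamming(average_hash(forged), target_bits)
```

A neural hash would need a differentiable loss and an optimizer instead of this nudge rule, but the shape of the attack is the same: measure the distance to the target hash, change the image a little, repeat.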
I'm sure not interested in proving they can. Mind furnishing the info about how it's really done, then? Since according to you (for very obvious reasons) you can never compare these images to the source for the hashes, where did you extract the hashes from?
If you can so easily reverse engineer false positives from random data without ever seeing or using genuine porn to produce it, shouldn't you be disseminating this content as widely as you possibly can, rather than warning people about the danger of interacting with these false-positive images?
Still puzzled how and why this is being done. Are you trying to render Apple's system useless, or not?
Summary: I'm saying "there may be a way to take existing images that are illegal even to possess, and process them to obliterate the image while maintaining the hash. Is that what's being done here?" and the response is "AM NOT!!"
Is there an archived link?
Edit: I guess this? https://gist.github.com/unrealwill/c480371c3a4bf3abb29856c29...
This is not true. They may match the hash, but they will not match the visual derivative.
The system is not as easily fooled as you think.
I would like to believe that is true, but the negative consequences of even generating a false positive are enough to not attempt to upload any image.
The database of 200_000 images used by Apple (and others?) is private, and I did not find any trace of the hashes (but I could have made a mistake here). So, how do you know that those correspond exactly (or within a certain threshold that has NOT been disclosed by Apple) to the CSAM DB?
Also, NeuralHash has NOT been released by Apple yet (https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...), so...
The way I see it, this is the best photo backup approach one can possibly take. Just get flagged for child porn, and have all your iPhone photos stored indefinitely on FBI servers.
Does the FBI have geo-redundancy?
Assuming the images do as claimed match the hash, they must also match the ‘visual derivative’ in order to trigger a match.
The system isn’t as easily fooled as is being claimed here.
The NeuralHash is what matters, solely.
[1] https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...
Thanks for the heads up.
The current practice is that Apple, Google, Microsoft, etc scan the content of your cloud storage.
The scenario that you described is a risk and has been since cloud providers started scanning 10-15 years ago. Some large companies scan their file servers as well.
Both must match to cause a positive.
These images may match the NeuralHash, although we have no proof of that at all. They will not also match the visual derivative.
This whole post is based on incomplete information.
Google has been recently focused on cultivating the image that they care about user privacy. The last thing they want to do is call the cops on a bunch of HN users for looking at some abstract swirly pics.
The images shown do appear to be adversarially generated inputs against some NN-based image hash or classifier, but there is no evidence to suggest that this is at all related to Apple's NeuralHash, or that the colliding hashes are from a real CSAM database (the target hashes are not public).
OP claimed they would "release 5 pieces of proof in the next 5 days" [1], and guess what, 11 days later they still haven't.
Look at OP's post and comment history, it's quite clear that they are a troll.
In the meantime, it has actually been proven that hash collisions against NeuralHash are trivially possible, see [2]
EDIT: Oh, I mixed up tabs. This is a link to a google drive of pictures. Because I have scripts disabled, I got no thumbnails, and I'm thinking since this was flagged, maybe I really don't want to get any thumbnails.
/r/Apple talks about this topic a lot and, similar to HN, is not happy about it. This drive link brings very little additional light to what was already known and discussed.
It’s certainly not correct.
The "perceptual hash" should be able to say "no, that's still the same image" while the file data has been entirely transformed.
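A small sketch of that difference, assuming a toy average hash as a stand-in for a real perceptual hash (it is not NeuralHash, and the 4x4 pixel values are made up). Changing one pixel by one level changes the file data, so SHA-256 changes completely, while the toy perceptual hash stays the same.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is above the mean."""
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

# Made-up 4x4 grayscale image, flattened: dark left half, bright right half.
original = [10, 20, 200, 210,
            15, 25, 205, 215,
            12, 22, 202, 212,
            18, 28, 208, 218]

# "Transform" the file data with a change nobody would notice by eye.
tweaked = list(original)
tweaked[0] += 1

sha_original = hashlib.sha256(bytes(original)).hexdigest()
sha_tweaked = hashlib.sha256(bytes(tweaked)).hexdigest()

crypto_changed = sha_original != sha_tweaked
perceptual_distance = hamming(average_hash(original), average_hash(tweaked))
```

The same tolerance that lets the perceptual hash ignore the tweak is what leaves room for deliberately constructed collisions, which a cryptographic hash is designed to resist.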
On what basis is a set of forest-like, post-alien-invasion and post-apocalyptic abstract art going to get flagged (my poor eyes see one or two that could have some symbolism)?
There really is a Simpson’s quote for everything.
Then the list gets more accurate and we move on.
Which is why I say they came to that decision inadvertently. It was their intention to play games with the regime, which backfired on them, not vice versa.
This thread was good https://twitter.com/fayfiftynine/status/1427900272148246530
Be specific, because I cannot find it.
I’ve heard from googlers that once an account is nuked for suspected child abuse no one will ever want to touch it to find out whether the ban was legitimate.
Apple said that the risk of collision is "1 in one trillion" which for a hash function would be terrible. We also don't know what the one trillion images they tested against were. If you upload your regular porn to iCloud, it's likely that pornographic images will raise more false positives than say, pictures of sunsets.
> As the system is initially deployed, we do not assume the 3 in 100M image-level false positive rate we measured in our empirical assessment
The "1 in 1 trillion" part is the probability that the number of false positives could exceed the threshold needed to trigger a human review:
> Apple always chooses the match threshold such that the possibility of any given account being flagged incorrectly is lower than one in one trillion, under a very conservative assumption of the NeuralHash false positive rate in the field.
source: https://www.apple.com/child-safety/pdf/Security_Threat_Model..., page 10
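To see how a per-image rate turns into a per-account rate, here is a back-of-the-envelope sketch. It takes the 3 in 100M figure from the quote, treats false positives as independent across images (a strong assumption, and one disputed elsewhere in this thread), and uses a hypothetical library size and hypothetical thresholds, since Apple states it uses a more conservative rate than the measured one. `prob_at_least` is a made-up helper name.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def prob_at_least(threshold, n, p):
    """P(X >= threshold) for X ~ Binomial(n, p), summed directly over the upper tail."""
    return sum(math.exp(log_binom_pmf(k, n, p)) for k in range(threshold, n + 1))

per_image_rate = 3 / 100_000_000   # measured rate quoted above
photos = 10_000                    # hypothetical library size

# Chance of at least one false match somewhere in the library.
p_any_match = prob_at_least(1, photos, per_image_rate)

# Chance of reaching a hypothetical review threshold of 10 false matches.
p_ten_matches = prob_at_least(10, photos, per_image_rate)
```

Under these assumptions a single false match is unlikely but not negligible, while reaching a multi-match threshold is far below one in a trillion. The result is only as good as the independence assumption: if certain kinds of photos collide more often than others, the per-account figure would be worse.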
As in, pursue a mechanism to get these onto somebody's computer in a way that they'll be backed up via iCloud (for instance, if a person's got their email account including trash folder backed up in iCloud, and you send them the pictures which they 'throw away' because it means nothing to them, placing the images in a trash folder in the mail preferences)
Is that (a) practical and (b) the intent of this exercise? Seeing as every question I've had here has led to karma burning I figured I'd double down and ask if the person doing this is trying to prepare a weapon for swatting people. There are times I respond to downvoting pressure to 'stop talking!' by getting more interested, which I'm sure is a common reaction among some hackers.
I don't remember that ever being a popular take on reddit.
But still, how do you know it's "the same people"? There are a lot of users who hold all kinds of opinions.
How can you be certain, and what prevents a generated image from matching both?
What prevents a generated image from matching both is that the attacker would need to know what the image they are trying to spoof looks like, in order to make a false positive of both. I.e. the attacker would need a copy of the original CSAM, and the spoofed file would end up looking like it could be at least plausibly mistaken for that exact image.
My challenge to you is this: what stops this system from being abused for non child pornography purposes?
The answer is: nothing. That's what has people's knockers in a twist. It is a backdoor, invisibly crafted, waiting to be subverted by an abusive power that manages to get into an advantageous enough position.
Arguing that Apple's algorithms are fine misses the point. The behavior should not exist.
Based on the documentation from Apple, they are waiting to get *several* matches, *not only one* (we don't know what *several* means, but I don't expect something like <= 3 pictures). Once the threshold has been reached, they ask a human team to review the "positive matches" and decide whether the images are CSAM or not.
If yes, after the manual process, the authorities are called.
it worked that time...
For what it’s worth, the null hypothesis is that they are just fakes and the commenter is at best trying to illustrate a point.
No, that's not “the null hypothesis”. It is a positive claim.
The poster is making a positive claim without evidence. Indeed the claim is unverifiable.
Reasonable priors lead to a null hypothesis that they are at least simply mistaken.
This is without even taking into account other indicators of credibility or authority, or perverse incentives, as priors.
This is a rational use of ‘null hypothesis’, but it also matches the scientific use, which would be that the claim is spurious unless experiment shows otherwise.
In any case, we know that the poster is in fact wrong in their claim.
I mean, if you're calling someone out, at least provide some evidence yourself. Short of a reproducible outcome, you're just as questionable in conclusion as the poster.
The poster’s claim is false based on what they have said.
> you're just as questionable in conclusion as the poster.
Not correct. You don’t need evidence to disprove a claim that is logically false. The poster’s claim is logically false.
Here is a copy of the explanation I gave elsewhere:
—-
I can be certain because I have looked at the images, and they are obviously not CSAM. Since the visual derivative is generated from CSAM, any spoof must look like it could be mistaken at a glance for CSAM.
What prevents a generated image from matching both is that the attacker would need to know what the image they are trying to spoof looks like, in order to make a false positive of both. I.e. the attacker would need a copy of the original CSAM, and the spoofed file would end up looking like it could be at least plausibly mistaken for that exact image.
You are changing the subject. That challenge has nothing at all to do with the OP’s false claims. They are still false.
Someone who can poison the database can indeed match non-child abuse images. The safeguard against that is that both Apple and NCMEC would need to conspire. This mechanism does not prevent such a conspiracy.
> Arguing that Apple's algorithms are fine misses the point.
Who is arguing that they are ‘fine’? I’m simply pointing out that they are not vulnerable in the way the poster claims them to be.
The images they have posted will not trigger the system.
If you want to debate the ethics of other aspects of what Apple is doing, there are plenty of threads elsewhere. This thread is about a false claim about a vulnerability in the system.
False claims about the vulnerabilities don’t help us to reason about what the risks actually are and detract from the moral or ethical debate.
No, it's not.
> The poster is making a positive claim without evidence.
True.
That doesn't make the alternate positive claim you have posited into “the null hypothesis”.
A null hypothesis is null. What you are stating may be your prior, but it is not the, or even a valid, null hypothesis.
> This is a rational use of ‘null hypothesis’, but it also matches the scientific use,
“Null hypothesis” is a very specific scientific term of art; it has no other meaning.
And, no, the specific counternarrative presented here does not match the scientific use of “null hypothesis”.
All images that do not actually trigger detection are fakes in terms of the poster’s claim.
That’s not my prior. It is the null hypothesis for any set of randomly selected images.
The poster’s claim is that the images have a special property. That is the positive claim which they failed to provide evidence for, and is logically false based on their description of the method.
Isn't this making the relatively huge assumption that humans and Apple's algorithms have the exact same opinion of what something "looks like"?
No. The visual derivative is designed to be matchable by human inspection.
Even if that was not true, which it is, the poster’s claim would still be false, since the poster doesn’t have access to the source CSAM and therefore would not be able to produce the visual derivative regardless of whether I could visually inspect it.
Being able to see by inspection that the images don’t match CSAM is one of two independent ways in which the claim can be shown to be false.