People obviously still see value in discussing it
Google Photos in 2015: https://www.wired.com/story/when-it-comes-to-gorillas-google...
Flickr in 2015: https://www.independent.co.uk/life-style/gadgets-and-tech/ne...
Facebook, like a lot of tech companies, has long had problems with diversity in engineering. Here's an article from April that discusses specific incidents and the broader background: https://www.washingtonpost.com/technology/2021/04/06/faceboo...
I've struggled with people telling me that these FAANG companies have "diversity problems," as a person of color myself. A majority of software engineers are female and male immigrants from East Asia and South Asia. These population centers are some of the most diverse regions of the world. The engineers who have been hired by preparing for and passing these companies' selective merit based coding tests had to overcome adverse conditions in their home countries as well, including extreme poverty, starvation, and totalitarian regimes.
Why do they not count toward diversity, to some white and white-adjacent critics? What message are we sending to people who are ethnic minorities from certain groups who earned their spots through merit and have also been targeted in recent newsworthy attacks, just as others have, when we make these kinds of accusations? What does a non problematic ethnic composition look like? What are these companies doing right toward some minority groups and wrong towards others?
If that is the case, why is it that Google voice nav routinely butchers the names of places and roads in India in spite of having thousands of Indian engineers on staff?
Could we blame the intractability of the problem, or just plain old incompetence, before we blame every single problem in the world on racism and lack of 'diversity'?
Silly Google TTS, the proper pronunciation is obviously "Malcolm the Tenth" there.
Once you search for these:
https://www.google.com/search?q=human+female+face&tbm=isch
https://www.google.com/search?q=human+male+face&tbm=isch
You can see that 'human face' has a bit of post-hoc tuning.
There's no super reliable way to prevent this (with current tech) other than forbidding that output entirely.
https://i.ibb.co/Mf6rVdf/Screenshot-20210907-002516-Photos.j...
Nobody who has traveled at all would mistake my wife and child for Japanese. And doing so is especially insidious considering the Bataan death march.
They probably are, but not good enough. These things can be surprisingly hard to detect. Post hoc it is easy to see the bias, but it isn't so easy before you deploy the models.
If we take racial connotations out of it then we could say that the algorithm is doing quite well because it got the larger hierarchical class correct, primate. The algorithm doesn't know the racial connotations, it just knows the data and what metric you were seeking. BUT considering the racial and historical context this is NOT an acceptable answer (not even close).
I've made a few comments in the past about bias and how many machine learning people are deploying models without understanding them. This is what happens when you don't try to understand statistics and particularly long tail distributions. gumboshoes mentioned that Google just removed the primate type labels. That's a solution, but honestly not a great one (technically speaking). But this solution is far easier than technically fixing the problem (I'd wager that putting a strong loss penalty on misclassifying a black person as an ape is not enough). If you follow the links from jcims then you might notice that a lot of those faces are white. Would it be all that surprising if Google trained from the FFHQ (Flickr) Dataset?[0] It is a dataset known to have a strong bias towards white faces. We actually saw that when Pulse[1] turned Obama white (do note that if you didn't know the left picture was of a black person, or who they were, this would be a decent (key word) representation). So it is pretty likely that _some_ problems could simply be fixed by better datasets (this was part of the LeCun controversy last year).
Though datasets aren't the only problems here. ML can algorithmically highlight bias in datasets. Often research papers are metric hacking, or going for the highest accuracy that they can get[2]. This leaderboardism undermines some of the usage and often there's a disconnect between researchers and those in production. With large and complex datasets we might be targeting leaderboard scores until we have a sufficient accuracy on that dataset before we start focusing on bias on that dataset (or more often we, sadly, just move to a more complex dataset and start the whole process over again). There's not many people working on the biased aspects of ML systems (both in data bias and algorithmic bias), but as more people are putting these tools into production we're running into walls. Many of these people are not thinking about how these models are trained or the bias that they contain. They go to the leaderboard and pick the best pre-trained model and hit go, maybe tuning on their dataset. Tuning doesn't eliminate the bias in the pre-training (it can actually amplify it!). ~~Money~~Scale is NOT all you need, as GAMF often tries to sell. (or some try to sell augmentation as all you need)
These problems won't be solved without significant research into both data and algorithmic bias. They won't be solved until those in production also understand these principles and robust testing methods are created to find these biases. Until people understand that a good ImageNet (or even JFT-300M) score doesn't mean your model will generalize well to real world data (though there is a correlation).
So with that in mind, I'll make a prediction: rather than seeing fewer cases of these mistakes, we're going to see more (I'd actually argue that there's a lot of this currently happening that you just don't see). The AI hype isn't dying down and more people are entering the field who don't want to learn the math. "Throw a neural net at it" is not and never will be the answer. Anyone saying that is selling snake oil.
I don't want people to think I'm anti-ML. In fact I'm a ML researcher. But there's a hard reality we need to face in our field. We've made a lot of progress in the last decade that is very exciting, but we've got a long way to go as well. We can't just have everyone focusing on leaderboard scores and expect to solve our problems.
[0] https://github.com/NVlabs/ffhq-dataset
[1] https://twitter.com/Chicken3gg/status/1274314622447820801
[2] https://twitter.com/emilymbender/status/1434874728682901507
I wonder how testing for that looks and sounds in a corporate environment. It may as well be an area similar to patents - you pretend that you never heard, never discussed, God forbid any mentioning in corporate email/chat/etc. or clicking on a link from inside a corporate network,...
Have we considered that AI and ML as a general brain replacement is a failed idea? That we humans feel we are so smart we can recreate or exceed millions of years of evolution of the human brain?
I'd never call AI a waste, it's not. But getting it to do human things just may be.
Even a child can tell the difference between a human of any color and an ape. How many billions have been spent trying, and failing, to exceed the bar of the thoughts of a human child?
Primates and humans are similar labels. This was almost certainly not intentional. Video classifiers are going to make mistakes - sometimes crude or offensive ones. I don't get outrage over labeling errors like this. Facebook should fix the issue - but they shouldn't apologize. It only encourages grievance seekers.
In every aspect of your life
No, I think it's racist because racists have a long history of calling black people primates, and because an automated system doesn't get to escape scrutiny and critique just because someone didn't specifically put in a line of code that emulates the actions of racists.
I understand that fb is at a much bigger scale, but that's all the more reason to have a much more diverse set of eyes to test their models before they go live.
If you want to avoid this, hire more black people, seriously.
I guess first step might be to "hire more black QA people".
"Oh, maybe we should look into that"
AI models are deterministic in a purely technical sense, but practically speaking, they are non-deterministic black boxes. It’s not as if you can write a unit test which generates all possible videos of black people and makes sure it never outputs “gorilla”.
On the other hand, imagine a world where these labels were applied by a massive team of humans instead of a deep learning algorithm. At Facebook's scale, would the photos end up with more or less racist labels on average over time? My guess is that the model does a better job, but this is just another example of why we should be wary about trusting ML systems with important work.
One worries that the corporate overlords are preparing the legal system to grant complete impunity to manufacturers of self-driving cars. "Sorry your child is dead; the car did it so there's no one to sue or convict."
I would say it's both. It's embarrassing for Facebook because it looks racist even though it really isn't. The system might be emotionless but the people who interact with it aren't, and we don't expect them to be.
Instead, I want to talk about pareidolia. Humans are social creatures. We have evolved to identify others of our kind and read their expressions. This was important to us, as we evolved alongside gorilla analogues as well, and the few of us that couldn't discern one face from another didn't usually last long.
I think we're trying to place too much of a human expectation onto these machines. I think that human features and primate features are strikingly similar, and it's our specialized brains that let us so easily discern. Yes, with enough data and training we could have more accurate models, but we can't cry foul every time an algorithm doesn't behave like a human does.
Reference: https://www.reddit.com/r/Pareidolia/
Facebook disabled Thai-to-English translation back in April because it translated the queen as “slut” and it’s been disabled since.
Maybe we should learn to accept non-fatal errors from applications instead of forcing things to stop entirely.
I find it ridiculous that my Photos app suggests I change monkey to “lemur” while I have plenty of photos of monkeys and zero of lemurs.
If you shine enough light on it, apparently the brand does. If a human were to do this, the company would immediately fire the employee and cut all ties with them. But as the article points out, 'fixing' an AI mistake isn't really a fix at all:
> [Google] said it was "appalled and genuinely sorry", though its fix, Wired reported in 2018, was simply to censor photo searches and tags for the word "gorilla".
I took a photo of the water pump from a car windscreen wiper and google was able to correctly identify what it was. I took a photo of a generic PCB which showed the back of a driver board for an LCD and google was able to bring up the exact type of board it was with the names of the ICs on it.
In these examples, google photos ai has far exceeded what the average human can achieve. We just have to keep in mind that these systems are not perfect and only a best guess which should be verified by a person later.
The problem here is not that the mistake was very costly or disruptive to the function of the feature, but that the mistake was highly offensive which is something very hard to avoid.
The problem it's solving is that it can do things that somebody with zero experience cannot. If you had an auto parts pro, or an EE, they probably could have done the same for you.
So, in general, AI is helpful because it has a much larger breadth of knowledge. Granted.
But I want examples of it doing depth, too.
My wife uses Lens when we fish. It's way, way worse than a fisherman with any experience at all.
Yes. It is currently known to fail at this prospect. It is an open research question as to whether current methods can be merely "scaled up" using more compute to achieve "general brain replacement". I personally am skeptical about that considering basic problems such as concept drift (but I am by no means an expert).
You define what counts as valuable to be arbitrarily difficult/inconceivable with current methods (because it's an area of open research) and then say we should divert course merely because we don't know it's possible?
> never call AI a waste, it's not. But getting it to do human things just may be.
It already can do things thought to be previously exclusively "human" (such as beating Go). Recently it also helped make significant advancements for protein folding which are sure to yield benefits to medical science at least indirectly. I believe this statement is either incorrect, or you're expecting people to have some strange definition of "exclusively human", which is of course also open research and unanswered.
Humans and machines are so different today. Of course machines beat us at number calculations and such. But we have organs that computers don't and can't have. And our brains are much more in tune with using those than power of 2 bit twiddling.
As we ourselves don't understand how it works, how can we ever write a machine that does?
Taken to the extreme, AI code is essentially something like:
add(M, N) {
return M + N + rand();
}
In addition, being tested with a (in relation to the complete set) very small set of input data.
Maybe to your typical SGD-type algorithm, working off a dataset filled with mostly light skin toned people, skin tone just looks like a real solid first-order way to distinguish humans and primates, and picking up the black people / primate distinction seems much more marginal and second-order, in terms of impact on the cost function.
If most of the people in the dataset were black, I predict you wouldn't see this.
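To put rough numbers on the "second-order impact on the cost function" point, here is a minimal arithmetic sketch. The group sizes and per-example losses are invented for illustration and are not taken from any real dataset or model; the only claim is that an average loss weights each group by how many examples it has.

```python
def mean_loss(groups):
    """groups: list of (num_examples, average_loss_per_example) pairs."""
    total_examples = sum(n for n, _ in groups)
    return sum(n * loss for n, loss in groups) / total_examples

# Invented numbers: 95% of the data is one group, 5% is another.
majority = (950, 0.10)        # model already does well here
minority_bad = (50, 1.00)     # model does badly on the small group
minority_fixed = (50, 0.10)   # same group after the problem is fully fixed

before = mean_loss([majority, minority_bad])
after = mean_loss([majority, minority_fixed])
gain = before - after  # how much the optimizer is "rewarded" for the fix
```

With these made-up numbers, completely fixing the small group moves the mean loss from roughly 0.145 to roughly 0.10. Any small improvement on the 950 majority examples competes with that, which is one way an unweighted objective can leave the minority case for last.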
I don't know Facebook's TOS sufficiently to know whether they are using private groups as source material, but if you're utilizing bigoted content to train pattern recognition, you will replicate bigoted content.
The AI is not that smart and these examples show it.
Humans are primates. It's weird that it selected such a broad label, but it didn't select an incorrect label.
e: I assume something similar has been done before by training a model on brown/black bears then throwing polar bears at it. Anyone know the outcome?
When I was quite young, I referred to some firefighters as robots.
which says a lot about the state of our alleged human outperforming AI
And I'd like to see a gorilla in any pose that's really hard (for a human) to differentiate from a person.
The truth is: the recognition algorithm is not very sophisticated after all.
So, this is going to happen.
Humans with a lot of experience are. Would kids be? I once referred to firefighters as robots as a kid.
Please do not trivialize acts that have the potential to cut humans so deep with handwavy substantiations. Facebook should have known better, and done better.
When you have an automated system that has irregular behavior to a given input, we call that a bug. Bugs exist in all software, not always unique, but always present. This software is no different than any other. It will have errors. Because the software is categorizing faces, its errors will result in miscategorizing them. The only relevant questions to this are how frequent these errors are and how disparate they are across racial lines.
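Those two questions (how frequent, how disparate) are at least measurable if you have a labeled evaluation set tagged by group. A minimal sketch follows, assuming a hypothetical list of `(group, true_label, predicted_label)` records; the group names and counts are made up and do not describe any real system.

```python
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, true_label, predicted in records:
        totals[group] += 1
        if predicted != true_label:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

def disparity_ratio(rates):
    """Highest per-group error rate divided by the lowest."""
    lowest = min(rates.values())
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no errors anywhere, so no disparity
    return float("inf") if lowest == 0 else highest / lowest

# Made-up evaluation records, purely for illustration.
sample = (
    [("group_a", "person", "person")] * 98
    + [("group_a", "person", "other")] * 2
    + [("group_b", "person", "person")] * 90
    + [("group_b", "person", "other")] * 10
)
rates = error_rates_by_group(sample)
ratio = disparity_ratio(rates)
```

This only counts errors. It says nothing about how offensive a particular error is, which is the other half of the argument in this thread.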
Another reference: this one is a Tool-Assisted Speedrun of a game that relies on basic image recognition software. While not entirely related, it does show how error-prone these algorithms can be. It's also fun to watch. https://youtu.be/mSFHKAvTGNk
Nobody likes the stories. No reasonable person is celebrating them. You’re not in disagreement with anyone.
These stories are about how we also deeply care about labels and categorization. Aren't we just looking at the natural selection (making them not "last long") of these way too rough AIs that step on boundaries that are pretty important to a lot of people?
Oh well, it's the times we live in.
If people simply laughed at the results and fixed the problems they'd miss all the endorphin rush of outrage.
From what I can tell the only fix here is a hardcoded workaround outside the net, or a substantially more powerful architecture.
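For what a "hardcoded workaround outside the net" might look like, here is a minimal sketch of a post-filter applied to a classifier's output. The label names, the blocklist contents and the score threshold are all assumptions for illustration; nothing here is Facebook's or Google's actual code.

```python
# Hypothetical blocklist; the real lists and label names are not public.
SUPPRESSED_LABELS = {"gorilla", "primate", "ape", "monkey"}

def filter_predictions(predictions, suppressed=SUPPRESSED_LABELS, min_score=0.5):
    """predictions: list of (label, score) pairs from some classifier.

    Returns the labels that are safe to show, highest score first.
    """
    kept = [
        (label, score)
        for label, score in predictions
        if label.lower() not in suppressed and score >= min_score
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in kept]

# Made-up model output for a single image.
raw = [("primate", 0.91), ("person", 0.88), ("outdoor", 0.62), ("dog", 0.10)]
shown = filter_predictions(raw)
```

The model itself is untouched, so the underlying misclassification is still there. The filter just guarantees the label is never surfaced, at the cost of also hiding it on photos where it would have been correct.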
I think the conversation can be made a lot simpler.
AI isn’t ready for anything important. Done. That’s it. If one of the pioneers in the field can’t tell black people from primates - it isn’t ready for driving or war or legal matters or really anything of importance.
I think we (colloquial) made something kinda cool and jumped the gun on when and where to use it.
It’s impossible to test every image for accuracy and to guarantee it won’t happen again, so they just sidestep it entirely.
Does not change anything about my original statement.
The end outcome of “oops I bumped into you” is basically nothing. The end outcome of an intentional shove is quite different (maybe lack of trust develops, etc)
The end outcome of punching someone in the face while drunk and saying “hey brah I didn’t mean to” is still gonna feel shitty no matter the intent
Doesn’t change the original statement I made one bit
https://www.nytimes.com/2021/09/03/technology/facebook-ai-ra...
I said it's not a waste. Not at all, I use it in a lot of the ways you describe.
It absolutely is racist. Racist outcomes are still racist regardless of whether there's a guy in Klansmen robes at the steering wheel or not.
Also, a 'simple error' made by a company with absurd amounts of money and several extremely public examples from its peer companies as to what not to do is, at that point, more negligence than anything.
Yes, I understand what systemic racism and implicit bias are, your condescending snark is appreciated.
Anyway, it's not racist because the result is not the product of implicit bias or systemic racism, it's a software bug that would have been possible no matter who was working on this software. As I wrote in another comment: the whole point of ML is to adapt to what is effectively an unbounded set of inputs, pretty much by definition there will be cases where even a team of 100% black people will train a model that, given the correct input, will fail in ways that particularly affect black people.
In my experience there are a lot of bigoted things on Facebook. If these are serving as source data, and are sufficiently distinguished from other training material, it may well be user behavior the ML system would replicate.
It forces a worldview where malice is the default assumption and encourages the "enemies all around us" mindset.
But it is an excellent question why Google Maps is still terrible at Indian place names even though they have plenty of people internally who not only could help, but would be delighted to. The answer to that will be essentially sociological. If you think that answer in no way includes structural inequity despite it being pervasive in America since its founding, you will have to explain how you think Google managed to eliminate that in the Maps division and then managed to re-introduce some sort of structure that leaves a wealth of internal knowledge untapped.
America is not unique in this. And African-Americans are not the only people in the world who were enslaved. What is unique is that America and Americans are so good at controlling narratives and sucking oxygen out of rooms that other stories and catastrophes are forced into irrelevance.
It would be interesting to test a bunch of midwesterners at their ability to tell Asians apart or to be able to distinguish various Asian ethnicities. My guess is that a lot of the distinguishing features that they look for are altered or missing.
It's true that we're good at recognizing faces (even where there are none), and distinguishing on a basic level (type of animal) but specific faces are mostly cultural.
https://gizmodo.com/why-cant-this-soap-dispenser-identify-da...
(And when you click "why" you get a picture of Arabic text, which can't be copy/pasted into translation software)
Shame on you for distributing false information, it's a good thing we have facebook protecting The Truth. /s
But the article also says that a counterargument could be that the existence of machines that aren't very suited to a big part of the population can be seen as proof of some latent Racism (to be more accurate, discrimination is closer to what's used in the article) whether intentional or not.
Obviously the people at google had thought of having more training data.
Or how about this one: Yes, all black engineers on the maps team live in New York.
Truth is that it is just an example one of the thousands of edge-cases that exist in these types of complex products, and some of them will look like they have some sinister basis.
https://en.wikipedia.org/wiki/Cross-race_effect
So at some level it breaks down for us too.
Sure it is. But if Facebook, Google and other American companies want to indulge their Americentric proclivities to the detriment of everyone else, they should voluntarily withdraw from the rest of the world.
Not to imply the problem is unsolvable, just that if an institution has zero tolerance for this mistake, the fix you're describing is no guarantee it won't occur.
Say you have 1000 classification targets. You have to produce a model that checks, for each target, the odds of it being classified as one of other 999.
You have to check, specifically, for "adult male as primate" out of a million potential combinations. And apply secondary business rules or optimizations to prevent that classification.
So yes it's possible, but it's not cheap, simple or easy.
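A rough sketch of what that check could look like: build confusion counts on a held-out set, then report the rate for a hand-maintained list of sensitive (true label, predicted label) pairs. The labels, the tiny evaluation set and the pair list are invented for illustration; a real audit would need far more data per pair.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def flagged_confusions(counts, sensitive_pairs):
    """Rate of each sensitive (true, predicted) pair among examples of that true label."""
    per_true = Counter()
    for (true_label, _), n in counts.items():
        per_true[true_label] += n
    report = {}
    for pair in sensitive_pairs:
        total = per_true[pair[0]]
        report[pair] = counts[pair] / total if total else 0.0
    return report

# With 1000 targets there are 1000 * 999 ordered pairs to consider.
num_classes = 1000
ordered_pairs = num_classes * (num_classes - 1)

# Made-up evaluation data.
truth = ["person"] * 4 + ["primate"] * 2
preds = ["person", "person", "person", "primate", "primate", "primate"]
counts = confusion_counts(truth, preds)
report = flagged_confusions(counts, [("person", "primate")])
```

Computing the matrix is the cheap part. Deciding which of the roughly one million pairs are unacceptable, and collecting enough examples to measure each one reliably, is where the cost described above comes from.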
Facebook just decided to shove the model out the door and not worry about the consequences.
Quality engineering work costs money and time. Facebook didn't spend it.
Put more frankly, the success of recent immigrants does not erase America's long history of brutality and exploitation toward blacks and Latin Americans. The latter is a problem that we have to solve regardless.
And I think it's worth noting that some of the immigrants have brought their own biases with them, such that caste discrimination is now also a problem in Silicon Valley: https://www.washingtonpost.com/technology/2020/10/27/indian-...
But given that America was far more brutal and exploitative towards Chinese immigrants than towards Latin Americans, why are Latinos so prioritized by these initiatives to favor certain racial groups?
> And I think it's worth noting that some of the immigrants have brought their own biases with them, such that caste discrimination is now also a problem in Silicon Valley: https://www.washingtonpost.com/technology/2020/10/27/indian-...
Ironic that in a discussion about diversity, you believe in a prejudiced stereotype about a major ethnic group in Silicon Valley. Casteism is pretty much a nonissue in Silicon Valley, if only for the simple reason that most Indian-Americans tend to be ignorant about the castes of most other Indian-Americans.
You might not share the beliefs of others that are gainfully rallying behind diversity as a cause to justify penalizing some minority groups for "doing too well" and bolstering others (the literal definition of discrimination), but it IS happening -- and certainly more people than "nobody" are backing it, provoking my original statements. Someone had to put Prop 16 on the ballot, for example (which was thankfully voted against by a large margin of fellow CA Democrats).
Turning it back on you, what should the point of a diversity program be? What's meant to be achieved outside those three goals?
You should also look up the extensive critiques of meritocracy as a concept. There's a lot of literature there.
Further, I know of no major tech company that uses a nominally "objective coding test" as the only criterion for hiring. And they shouldn't, because being good at taking coding tests is not the job and not what we should be hiring for.
The companies are then righting the wrongs on the shoulders of innocents, that most likely never were racists to begin with. In short, just committing to another mistake.
South Asia and SE Asia, maybe. But East Asia (NE China, Korea, Japan) has actually one of the most ethnically "pure" populations in the world.
Northeast China–usually defined as the provinces of Liaoning, Jilin, and Heilongjiang–does not belong on your list. According to the 2000 Chinese census, about 10% of the population of Northeast China comes from ethnic minorities – the majority of whom are Manchus, but also including significant numbers of Mongols and Koreans. That is far from being 'one of the most ethnically "pure" populations in the world'-especially when compared to Japan or Korea.
Indeed, even though Northeast China was (in 2000) approximately 90% Han, prior to the 19th century Han were a minority in the region, and Manchus were the numerically (and politically) dominant ethnic group.
According to the 2000 Census, the most ethnically homogenous part of China is not the North or Northeast, but rather Eastern China, which is over 99% Han (and, as well as being over 99% Han overall, 4 of its 7 provinces are over 99% Han too.) By contrast, North China is about 94% Han and Northeast China is only around 90% Han.
(There have been two Chinese censuses since, in 2010 and 2020, but I can't find ethnicity figures for them.)
If you want to argue for the general case, you can simply prove the negation is false. Since it is incorrect to say that a network trained with a tiny percentage of possible inputs will never misclassify, it is true that a network trained in such a way will eventually misclassify. This is bolstered by training any network and seeing they always will misclassify something.
> Yes they did, I said that.
You didn't say that. You said a misclassification would happen on some inputs. That is different from saying on these specific inputs.
So yeah, I think it is an issue with generating a data set and not covering a sufficient number of test cases. In this instance, Asians would be an edge case: if you build a small data set to train an algorithm on, a group with lower representation in the population ends up with very few examples.
Let's separate the general case from the specific. Generally, we know that representation in the people who make things changes what they make. This is obvious and undeniable. For example, look at ASCII vs Unicode. The Chinese invented movable type 500 years before Gutenberg, so it's not like the idea of printing non-roman characters was novel. In the age of telegraphy, Europeans developed encodings that included umlauts and accents; by 1851 they were merged into International Morse Code.
So why in 1963 was ASCII codified without any of that? And why did that become the dominant standard for an extended period? Because it was mainly Americans in the rooms where the technology was being created.
Similarly, we know that standard color films were developed by white people to represent white people well: https://www.vox.com/2015/9/18/9348821/photography-race-bias
And we all know how this happens. It's the same reason a lot of open-source software is good for a developer audience, not an end-user one: making things means iterating on them until they're good enough for the people involved.
That's the general case, so let's return to the specific case. If you want to prove that ML systems doing racist stuff has nothing to do with who made it, then you can't just handwave it away. You have to show why that specific project was set up so carefully and so well that it would avoid the natural pitfalls of any technology project. And then despite that it went on to do racist stuff. For reasons that you'd then have to explain.
'Eleven Jinping': Indian TV fires anchor over blooper.[1]
That would be a valid reason, but I suspect a more culturally appropriate one: loss of reputation. We are sensitive to that.
My point was this isn't something that only goes on in 'white' brains but more of a cultural issue. Most people in the West are incapable of pronouncing Asian names. I don't see people making a big issue out of it.
Or do you think they had a team (on a completely different project or perhaps company) write a text to speech function that wasn't well suited for directions?
Streets have lots of numbers after all. People frequently have numbers in their name.
(Leave the software engineering to the software engineers)
I wager that there is more text online about Louis XIV than about Malcolm X. Certainly there are many more books on that epic corner of French history than on one modern US leader. Then there are all the British kings. Point an AI at the internet and it would likely decide that roman numerals are most often pronounced as numbers rather than letters. Malcolm X would be a rare exception that might need to be hard coded.
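As a toy illustration of "general rule plus hard-coded exception", here is a minimal text-normalization sketch that expands a roman numeral after a capitalized name into an ordinal, unless the phrase is on an exception list. The regex, the function names and the exception list are all hypothetical; this is not how Google's TTS works, which nobody in this thread actually knows.

```python
import re

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Hypothetical hard-coded exceptions where the letter is read as a letter.
EXCEPTIONS = {"Malcolm X"}

# A capitalized word followed by a run of roman numeral letters.
PATTERN = re.compile(r"\b([A-Z][a-z]+) ([IVXLCDM]+)\b")

def roman_to_int(numeral):
    """Convert a roman numeral using the usual subtractive rule."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN[ch]
        # A smaller value before a larger one is subtracted (IV = 4).
        total += -value if ROMAN.get(nxt, 0) > value else value
    return total

def ordinal(n):
    """Return n with an English ordinal suffix, e.g. 14 -> '14th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def expand_regnal_numbers(text, exceptions=EXCEPTIONS):
    """Rewrite 'Name XIV' as 'Name the 14th' unless 'Name XIV' is an exception."""
    def replace(match):
        if match.group(0) in exceptions:
            return match.group(0)
        number = roman_to_int(match.group(2))
        return f"{match.group(1)} the {ordinal(number)}"
    return PATTERN.sub(replace, text)

king = expand_regnal_numbers("Louis XIV")
street = expand_regnal_numbers("Malcolm X Boulevard")
```

The heuristic deliberately over-matches (it would also rewrite "Vitamin C" or a sentence-initial word followed by "I"), which is roughly the point: whichever reading the rule picks as the default, somebody's name ends up needing an entry on the exception list.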
I personally have gotten bugs fixed at Google. How? Because I, a white man, spotted a bug, cared about it, and talked to white men of my acquaintance at Google who had enough power to get things done. How did I know them? From other tech companies created, run, and majority staffed by other white men.
Why am I in these networks at all? Well, my dad was a software developer and he introduced me early on. How did he get his start? His dad, an insurance company exec, brought him in to deal with this newfangled computer thing they had just gotten. That was in Milwaukee in the mid-1960s. I promise you that although Milwaukee had a significant black population, exactly zero of them were insurance company executives in the mid-1960s.
So what Allie Bland knew when she wrote her tweet was that she did not have any connection to Google where she might be able to get a to-her glaringly obvious pronunciation issue fixed. That in her estimation no black person did. And I see no reason to think she was wrong.
Coding tests examine the type of work actually required to be done on the job (as coders), and they have been correlated with post-hire performance successfully. Someone who is not familiar with efficient data structures will not write scalable code and will end up creating a burden on their teammates during on-call, for example. Asking someone to solve an engineering problem with a provably correct answer is an objective test for hiring engineers, and I will have a difficult time continuing to engage with anyone who counteracts this basis of reality and truth.
When I was hired there were three coding test rounds and one interpersonal round. You might argue that the latter is where racial discrimination seeps in, as well as the recruiter outreach step itself, but somehow I am optimistic that a bunch of tolerant Californians have moved past applying a Literacy Test here already by hiring a majority immigrant / minority workforce. In my situation, my recruiter was also an Asian-American minority.
Since my comments here don't seem to be making any sense to you, I'm not seeing the point in trying again.
Long ago I learned that it was rarely worth my time to try to argue online people out of their ignorance. A rando with a throwaway account, a strident tone, and a fair bit of ignorance on the topic is almost a guarantee that there's no point.
If you're interested in knowing something about the topic, you'll do some work. If you aren't, no amount of me spoon-feeding you summaries of serious scholarly works will change that.
If you do end up learning something and have questions, feel free to email me. I'm glad to discuss the topics with people who are serious about it.