Part of my code makes Copilot crash (github.com)
You can see it in the original link to the discussion: Answer selected by davecheney
Now I’ve got Gen-Z developers that are confused and upset when `git init` does what it’s always done.
GitHub, Microsoft ownership notwithstanding, was always going to inject its employees’ politics into Copilot.
Except some people want to punish others for their opinions. That is the gasoline. And Microsoft is selling gas cans.
It’s a comment from a third party speculating over what causes the crash.
> Heargo 24 days ago > Thanks, I'll try as soon as I get the problem again (somehow it's not bugged anymore...).
Looks like it was just a temporary issue, with no evidence that it's due to a word filter.
https://twitter.com/moyix/status/1433254293352730628
The Register wrote about it too: https://www.theregister.com/2021/09/02/github_copilot_banned...
They have since moved the bad word list server-side to prevent people from figuring out what's on it, but it's still there. This is easy to verify, just ask it to complete something that would include a banned word; my favorite here is "Israel", and it will just sit there and refuse to complete, either via inline suggestions or in the sidebar view that gives you 10 choices:
https://i.imgur.com/O97YwKc.png
This was what I managed to decode of the list (in ROT13 form to prevent accidental offense):
https://moyix.net/~moyix/copilot_slurs_rot13.txt
No doubt they've added and removed some things since then.
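For anyone who wants to read a ROT13-encoded list like the one linked above, Python's standard `codecs` module can decode it. This is only a minimal sketch: the helper name `rot13` is made up for illustration, and the sample strings are two harmless words mentioned in this thread, not contents copied from the actual list.

```python
import codecs


def rot13(text: str) -> str:
    """Apply ROT13; applying it twice returns the original text."""
    return codecs.encode(text, "rot_13")


# Harmless sample words taken from this thread, not from the real list.
encoded = [rot13("gender"), rot13("Israel")]

# ROT13 is its own inverse, so the same function decodes.
decoded = [rot13(word) for word in encoded]
```

Because ROT13 is symmetric, the same call works in both directions, which is why it is a common way to hide text from a casual glance without really protecting it.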
(Because I had to look it up too: "banana bender" is a humorous term for an inhabitant of Queensland, Australia. It doesn't appear to be considered offensive at all.)
1. Copilot makes a suggestion that implies gender is binary, a certain community explodes with anger and an entire news hype cycle starts about how Microsoft is enforcing views on gender with code.
2. Copilot makes a suggestion that implies gender is nonbinary, a certain community explodes with anger and an entire news hype cycle starts...
You can't win... so why not plead the fifth?
To all those claiming this is an example of "wokeism", remember the proper response from an individual who believes in nonbinary gender would be to offer suggestions of the sort. There is no advocacy here. Mum's the word.
Certain words are heavily loaded and are worth just skipping to avoid all the hassle for now.
If you didn't mean "should be" (for which I'm not willing to take any position), no, Copilot is not a product for adults [1] [2].
[1] https://docs.github.com/en/site-policy/github-terms/github-t... "A User must be at least 13 years of age."
[2] https://docs.github.com/en/site-policy/github-terms/github-t...
What I would've preferred one of these threads to be about is how all of this works. Like, how do they post-hoc filter certain things? Is that the only way to deal with things defined as issues in ML?
You can't be neutral on a moving train, as they say.
If I were Microsoft, I would post a shruggie and say Copilot offers arbitrary responses based on the actual code it reads; it is not supposed to be "correct" or good or fair, but just follows what it sees other people do.
While I agree with you, that is very much the game that is being played here. We have competing world views and one way to help a world view dominate is to play a linguistic war. That was the point of Newspeak in 1984 (https://en.wikipedia.org/wiki/Newspeak). If you control the language such that competing ideas are instantly taboo just by the words required to describe them you can stop people from promulgating those ideas. So you gain ground without ever having to debate the new ideas.
This has happened in many countries when one religion dominated. Western society was starting to get to the point where its taboos were being shed and ideas could win based on their merit. Sadly we're regressing back to a society controlled by dogma rather than an open exchange of ideas. I suspect this is the normal state of human societies: we fluctuate between open and closed societies.
The last time Microsoft did that, they ended up with their bot posting racist content on twitter. They of all people understand that just following what people do on the internet is a recipe for disaster.
The idea of science is to get rid of models that are wrong.
A certain community explodes with anger since their machine learning dev-tooling is closed and has arbitrary restrictions.
If you try to please everybody, someone won't like it.
Let's not handle ethnicity either; if we're going to be sensitive about gender, that is an area which is also sensitive for many people.
Should it take border disputes etc. into consideration? If you're using it in country X, and country X thinks a particular area belongs to it despite most of the world disagreeing, will you not be able to use Copilot to generate code that supports your remote employer's international operations?
It would make better sense if Copilot had warnings it could issue, and when you wanted gender it put up some sort of warning about that, or allowed you to choose binary gender / multi gender solutions.
The idea that it should fail, and that it makes sense for it to do so, is essentially a critique of the whole code generation idea.
on edit: obviously HN should be able to come up with lots of other things that might cause media related problems if CoPilot handled it, code to detect slurs, etc. etc.
How would that work though? What can Copilot suggest that can imply that?
    If gender is true
        Do something…
    Else if gender is not true
        Do something else
    Else
        Do nothing
But I agree you can't avoid offending people. The world is nuts; everything is offensive to someone.
And also: give the list of banned words
You have made up a total strawman. It is like if someone said "If that person were stabbed with a knife, they would be angry", and you responded "Do people really get angry at emotionless knives? That's a mental problem, their anger is directed inward".
We're zeroing in on how silly it is for Copilot to trigger its content filter on the word "gender".
To me the real issue is that copilot has a content filter in the first place. It's unwelcome and unnecessary.
"What was going on in the head of the person writing the parser?"
I mean, were they thinking that if someone is writing code, let's say, for a gender dropdown and it was only ["male", "female"], it would try to suggest to us to add 26 more genders instead (and worse, suggest a list of genders to add)?
Would the intention be to correct us and popup a message saying "We suggest you add more genders so as not to displease the users of your product"??
What was going on in that person's head who is trying to do all of this? What was their thought process? What were they trying to accomplish around gender?
Was it the programmer, or some product manager that insisted on some kind of "copilot adjustment" for this because of a personal political viewpoint or just for GitHub being more woke?
That's the most troubling aspect to this.
I hope to Jesus Christ it was just a mistake.
If your autocomplete was capable of spitting out suggestions that made you feel isolated or kept poking you in the eye about aspects of your identity, you might feel a bit better about the creators having thought about that and taken steps to avoid it happening.
Gender is, in actual material fact, binary, and extremely strongly correlated with sex. Building a crimestop into an ML model is just teaching the machine human biases and delusions.
A metaphor for our times.
Looking back, I don't even know why I made it an enum, rather than a 1-bit bitfield called is_woman - but in the end I was glad I didn't, because the art director moaned a bit about the clothing colour distribution, and somebody asked if we could have some mascots, and there were some complaints about the unreasonable number of interesting hats. And, so, long story short, by the time we were done, we had 18 genders based on clothing colour and type of hat, 2 genders for mascot (naturally: tall, and squat), and a table to control the relative distributions.
Once we got to 5 genders I tried to change the enum name to Type - but we had this data-driven reflection system that integrated with various parts of the art pipeline, and once your enum had a name, that was pretty much that. You were stuck with it.
Is that a metaphor for our times too? I don't know. My own view is that sometimes stuff just happens, and you can't read too much into it.
Interestingly, I don't know of any zoological cases that would require more than a short int to enumerate.
> A metaphor for our times.
Social media amplifies an innocuous, extremely low stakes occurrence into a heated discussion because it happened to misstate the facts (nothing is crashing here) and focus on a hot button keyword ("gender" is only one of many blocked words)?
More and more things are going to be filtered through large language model apps and the possibilities for cascading failures will be even more interesting than what exists presently.
I was able to get GPT-3 to spit out reasonably accurate biographies for a couple of composers I know.
GPT-3 could go even further — one of my composer friends has a reasonably rare first name, and when given the prompt "There once was a man named $first_name", GPT-3 responded with a number of limericks tailored to his particular set of skills.
There once was a man named $first_name,
Who never accepted the blame.
He went on a bender,
And talked about gender
[INFO] [default] [2022-07-10T07:59:07.641Z] [fetchCompletions] engine https://copilot-proxy.githubusercontent.com/v1/engines/copilot-codex
I would not be surprised if someone found some Copilot output stemming from "gender" and reported to MSFT/GitHub for them to simply short circuit or "break" after finding certain keywords.
For images/ video I can see merit, ex: using that nudity inference project on images of children, but text seems particularly pointless.
Perhaps Github is worried about a backlash if it suggests code that allows for more than 2 values.
I'm pretty sure a bot would swoop in and say something like "NO LOL", which ironically only encourages more LOL.
Does GitHub Copilot produce offensive outputs?
GitHub Copilot includes filters to block offensive language in the prompts and to avoid synthesizing suggestions in sensitive contexts. We continue to work on improving the filter system to more intelligently detect and remove offensive outputs. However, due to the novel space of code safety, GitHub Copilot may sometimes produce undesired output. If you see offensive outputs, please report them directly to copilot-safety@github.com so that we can improve our safeguards. GitHub takes this challenge very seriously and we are committed to addressing it.
The bug's apparent trigger word is close to a hot-button poli-sci issue. Can we please focus on the technology?
I totally agree that this story has a high risk of flamewars.
But it definitely has a heavy technology component, too.
Are there any other break-words? Master, slave, Carlin's seven words, etc?
This means one solution for those worried about copilot laundering around code licenses is to put a statement like "for more details check the man page" at the end of each docstring.
    import random

    def genderPrintResult(GenderBool):
        # Prints "Yes" or "No"; only the names mention gender.
        if GenderBool:
            print("Yes")
        else:
            print("No")

    GenderMyVar = random.randrange(10)
    GenderThreshold = 5
    genderPrintResult(GenderMyVar > GenderThreshold)
You literally can't make any statements about gender, no matter how benign, without pissing at least a few of your users off.
https://twitter.com/moyix/status/1433254293352730628?t=NIpgb...
Anyone have any good recommendations for Copilot alternatives?
They refer to it as “eliminating bias”, but it’s really just an attempt to mold these new technologies into conformance with one very specific set of ideological commitments.
Proponents view it as some kind of obvious universal good, and are confused when anyone else is appalled by the blind foolishness of it all.
I don't think, e.g. being able to handle black faces correctly is some sort of massive ideological commitment. So let's not pretend that the entire concern of bias in AI is irrelevant, no matter where you stand on gender.
> conformance with one very specific set of ideological commitments
You know-- let's just talk about basic respect and dignity: if someone strongly wants to be referred to in a particular way, the polite response is to respect their wishes. If there's a lot of people in this category, it makes sense for your system to address it.
If you instead build your system in a way that you don't achieve this, you're being rude. If you use old training data and refer to people as a "Mongoloid" as a result-- don't be surprised that people are offended. Ditto, if you use old training data about gender that doesn't match many people's current expectations.
It is quite literally creating bias.
"Microsoft's AI Twitter bot goes dark after racist, sexist tweets"
https://www.reuters.com/article/us-microsoft-twitter-bot-idU...
Whatever text I write in Word, is written by Microsoft too, by extension?
*not*
For example https://www.cbsnews.com/news/microsoft-shuts-down-ai-chatbot...
If you have to deal with those kinds of people, you're willing to sound silly just to protect yourself.
The issue isn't that it isn't producing a suggestion, but that it stops producing a suggestion altogether for the rest of the file.
I don't use copilot anymore because it just results in poor quality code and additional cognitive overhead (because you need to read and discard the shitty suggestions) as you type. It both slows you down and exhausts you. So you can really think of this as a feature. You'll write much better code as soon as copilot shuts down. It should do this more often.
> Would the intention be to correct us and popup a message saying "We suggest you add more genders so as not to displease the users of your product"??
You can just as easily assume that they don't want a dropdown with 26 additional genders to just pop up automatically. That would upset a lot of people, many of whom are in this thread. I think whoever wrote the code doesn't want to jump into a political shitstorm.
Hurting the feelings of the true believers was the ultimate sin, a sin often committed, but only punished if the sinner did not recant and change his ways in a brutally public and official way. It was there that the ____ church revealed what it was really all about all along: societal control, maybe with good intentions to start with, but in the end, just control for its own sake and to prevent others from achieving the same control.
Not saying that any social movement could turn into a religion. That would need strange clothing, processions, rituals, codified language and most of all a mythology.
I have no religious preference; I'm on the side of science and would like to have a civil society where no member is violated by another. I would very much prefer it if the combatant religions involved could leave science alone. Reality is often disappointing.
May the religion with the least suffering caused win and then keep away from the state & power.
I think the goal of any sufficiently large society should be that any religion or ideology can rise. Many people can become a part of that, yet the religion or ideology is unable to persecute those who don't agree with it.
I also have no idea how you achieve that. It's my utopia and, like most people's vision of a utopia, is probably not possible in reality.
For example, a couple years ago, there was a big hubbub over a Google Image labeler that labeled a black man and woman as "gorillas". A mistake for sure, but the headlines about the algorithm being "racist" were wrong. The algorithm was certainly incorrect, and it could probably have been argued that one reason it was wrong is that its training set contained fewer black people than white people, but the algorithm was certainly unaware of the historical context around this being a racist description.
Similarly, in the early days of Google driving directions I remember one commenter saying something along the lines of "You can tell that no black engineers work at Google" because it pronounced "Malcolm X Boulevard" as "Malcolm 10 Boulevard". Of course, the vast majority of time you see a lone "X" in a street address it is pronounced "ten".
It's kind of analogous to the "uncanny valley" problem in graphics. When the algorithm gets things mostly right, people think of it as "human-like", and so when it makes a mistake, people attribute human logic to it (it's quite safe to assume that a human labeling a picture of black people as gorillas is racist), as opposed to the plain statistical inferences ML models make.
https://www.theverge.com/2016/3/24/11297050/tay-microsoft-ch...
File under not "Why we can't have nice things", but "downstream effects of why we can't have nice things".
This is my mental image of how company happy-hour-Fridays play out. It's one of the reasons I don't drink.
[And if you're curious, in fact I'm not fun at parties ;) ]
This is absolutely silly. Solid work GitHub team!
I wonder if this is to prevent it from accidentally processing PII or PHI data. Maybe someone else who didn’t get their account on some kind of cooldown can try it with “birthdate” or “DOB” or “SSN”. I highly doubt this has anything to do with gender being a controversial or blocked term for political reasons or something.
If people were trying to reproduce it and failing, I agree with you that would be a different story.
Are you unable to believe someone might think differently than you do, without it being explained away as "artificial voting"?
Edit to add: Nice, they edited their comment. Previously it accused HN of having an artificial voting conspiracy. So that is what my comment was about. I will not edit my original comment above.
I'm just honestly super exhausted by any of the insanity right now, not even only regarding this topic. It's just complete black-and-white thinking these days, no matter what it is about. Extremes only. The stronger your opinion the better, how else would you feel like you exist? Almost no one with a rational, centered overarching perspective. Twenty years ago 50% of the current population would've been considered as possibly having BPD.
Is it? To me it feels like it's getting worse and worse, but that might be my bubble.
Twitter is so idiotically designed that it just makes things worse and worse.
With Twitter, they don't distinguish between positive engagement (retweets, positive replies) and negative engagement (critical quote tweets, critical replies)... their algorithm just stupidly sees the engagement and amplifies the tweet.
There's no wonder that a lot of the most extremist politicians (on the right and left) built their followings on Twitter.
I don't mean to make it political, but the last president of the United States was able to build a massive following almost exclusively through one political platform... (which he later got banned from)
For some reason, YouTube's decided to jump on the same bandwagon by removing dislikes (if there is no feedback from dislikes, then people will stop clicking them).
Some people feel that wokeness is ruining the world. I can't really speak to that position because my political initialization was on the other side of the cultural gulf in America.
The way I have come to understand transgender issues is very much shaped by the political left, but also by a religious upbringing (Catholic, Jesuit). On the left, I am told that this is a human rights issue. I am inclined to believe that transgender people have a hard time in life. I am also inclined to believe that it is not a mental disorder, and I came to these conclusions through conversations with transgender people I have worked with in the past, as well as through what I learned in my psychology classes in high school and college.
I am a white male who was born that way, but I definitely know what it feels like to be ridiculed, to not belong and to feel that there is no right place for me in this world. I have been abused, made to feel small, ostracized and bullied. Those experiences have given me a pretty deep understanding of what suffering is, and how it can be caused. It has also softened me and made me pretty empathetic to others who feel they don't belong in this world.
As an example, I was once at a comedy show where a comedian made a transgender-adjacent joke. The humor of the joke was all in a stupid pun, and I thought it was pretty funny because I like stupid puns. But there was a transgender woman in the audience who got immediately angry. I don't remember exactly what she said, but it was something along the lines of "That's not funny, I'm sick of people like you shouting at me in the street!!". If I had to go though my life having people shouting at me in the streets of NYC because of how I looked vs how other people thought I should look, I may have responded in the same way. I thought the joke was funny, but for her it touched on some deeply painful memories of abuse, dragged them to the surface, and activated a lightning-quick temper. Perhaps if I'd been abused for as long as she, and in the same way, I wouldn't have thought the joke was funny either.
I understand people don't like being corrected, or told that they're wrong or that they're hateful. I don't think that is a productive way to bring about change; and yet, I have found myself picking fights with my parents, and getting generally nasty when they have failed to understand some value I have learned that I did not learn from them. That is obviously a bad thing, because the message they come away with is "what a jerk!" or "those damn lefties!". What I'd rather have people come away with after they hear me speak is something quite different. It was only after raging at my parents enough times that I decided I just wouldn't talk to my parents about politics. There is more right about my parents than there is wrong about them; they are getting older and their bodies will decline until they die. Most likely it will happen to them before it happens to me, at a time when I am able in body and mind, so I intend (even though I sometimes fail) to spend the rest of our time together as peacefully as possible.
I offer this earnestly in good faith. Sometimes the message gets muddied in the delivery, or because I get upset when I perceive (or sometimes, misperceive) that someone is being uncaring for those who are already suffering enough. I think I react that way because of my own history of abuse.
I am also open to hearing the other side of this story. I have attempted not to misrepresent $OTHER_SIDE's view of things. I am only speaking to why I have such strong feelings about this issue. I am sure others have equally strong feelings on another side, and I am open to hearing what that sounds like, provided the viewpoint is offered with respect.
We can be empathetic without placating some really tyrannical trends.
Gorilla and black just was the most politically charged one of the bunch.
(The other potentially politically charged one was some tendency to misclassify people of various levels of body fats as various animals.)
> Certain words are heavily loaded and are worth just skipping to avoid all the hassle for now.
If memory serves right, that was Google's pragmatic solution: if they detected a human in the picture, they 'manually' suppressed the animal classification.
So they lost being able to classify 'Bob and his dog' in return for not accidentally classifying a picture of just Alice as a picture of a seal.
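The trade-off described above can be sketched in a few lines. This is a hypothetical illustration, not Google's actual code: the label names, the `ANIMAL_LABELS` set and the function `suppress_animals_near_people` are all made up for the example.

```python
# Hypothetical label set; a real classifier would have far more classes.
ANIMAL_LABELS = {"dog", "seal", "gorilla", "cat"}


def suppress_animals_near_people(labels: list[str]) -> list[str]:
    """Drop every animal label whenever a person is among the predicted labels."""
    if "person" not in labels:
        return list(labels)
    return [label for label in labels if label not in ANIMAL_LABELS]


# 'Bob and his dog' loses the dog label...
bob_and_dog = suppress_animals_near_people(["person", "dog", "park"])
# ...while a photo with no person detected keeps its animal label.
just_a_dog = suppress_animals_near_people(["dog", "park"])
```

The rule is blunt on purpose: it never has to judge whether an animal label is a misclassification, so it cannot get that judgment wrong, at the cost of dropping correct labels too.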
My hope is that the people here, at least the level headed ones, will rise to positions of influence and not the people rioting at every chance.
I can understand the concern about powerful organizations imposing a viewpoint surreptitiously via a widely-used piece of software. That is definitely a reasonable fear, and if we allow concentrated power (MSFT here) to behave in that way, then we are in for trouble.
I'd argue that we wouldn't end up with tyranny, but rather feudalism. There are other powerful organizations that can push their own viewpoints, surreptitiously and overtly. It then becomes a game of who has the most resources and control over the flow of information. But while I prefer "feudalism" to "tyranny", I don't disagree that if propaganda was the aim here, it would be a bad thing.
I don't agree that GitHub's aim was to impose a viewpoint. I believe the aim was to avoid putting this tool in the middle of a very politically charged issue. For example, how do we know GitHub didn't make this decision like this: "We don't want 8 genders popping up in a <select>, because that will offend the 50% of the country who only believes there are two". We are seeing some evidence of this (and the other 50%) in this thread.
Finally, maybe they are trying to prevent people from spamming their learning model with politically-charged content. If that were the case, you could argue that they are just trying to prevent their programming tool from becoming a political warzone, with competing sides trying to train their viewpoints into the model. I admit I know very little about ML in general and Copilot in particular, so you'll have to bear with me if that sounds naive. In any case, social media is an example of a tool that has become a political battleground, even though that wasn't the initial purpose. If preventing Copilot's politicization is GitHub's aim (and I have no evidence that it is), then I'd say that's a reasonable thing to want for a product if you don't want it to become unpleasant-to-use before long.
So we have three hypotheses:
1. MSFT believes there are more than two genders, and wants to impose that viewpoint
2. MSFT believes there are only two genders, and wants to impose that viewpoint
3. MSFT wants to avoid having politically-charged content in their code-generation tool.
How can we point to one of these being more correct than the other?
I don't have any evidence to support any of the three. But 1 and 2 each make three assumptions (MSFT has one viewpoint on issue X; this viewpoint is Y; they want to impose it). Hypothesis 3 makes two assumptions (Gender is politically charged; let's avoid that in our product). That's all I can think of this late at night. I could be missing something.
Just a reminder that often reality is more complicated than we think. Names, numbers and upper/lower case are the usual examples.
No biologist would claim that the sex is constructed.
A 13-year-old is perfectly capable of that; I know many 40-year-olds who aren't.
export const CLAIM_PAYLOAD_SCHEMA = Type.Object({
    "iss": Type.Literal("my-app"),
    "exp": Type.Integer(),
    "sub": Type.String(),
    "name": Type.String(),
    "priv": Type.Integer({minimum:0, maximum: Privileges.All}),
    "gender": Type. // No completion is available.
Additionally, I get "No completion is available." from copilot.el on every line after that one, but completing on lines before it does work. When removing "gender", it works again, e.g. suggesting `"iat": Type.Integer()` for that line. I don't actually plan on using "gender" in my tokens, but it is a bit frustrating that an arbitrary word can opaquely disable Copilot for the rest of the file.
They just described almost identical behaviour but with an isolated test case. Yeah, there’s no video or whatever, but it does support the original diagnosis.
Then you do it and report back when he's wrong.
"Candidates must be from one or more of the following equity-seeking groups to apply: women, persons with disabilities, Indigenous peoples, and racialized groups"
Ignoring the potential offensiveness and YOLOing through it is swinging the bat wildly at every pitch.
What I'm suggesting is that Copilot should keep working when these words are present, but refuse to attach any significance to the specific word. This could probably be implemented by replacing the problematic words with randomly generated strings before processing the text, then swapping those strings back afterwards.
(It could be reasonable for Copilot to refuse to make suggestions at all if the output would contain truly offensive language, like unambiguous racial slurs or sexual terms. But "gender" clearly isn't that.)
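The replace-then-swap-back idea could look roughly like this. It is a sketch under stated assumptions, not how Copilot works: `SENSITIVE_WORDS`, `mask` and `unmask` are hypothetical names, the one-word list is a stand-in for the real (non-public) list, and the model call that would sit between masking and unmasking is left out entirely.

```python
import re
import secrets

SENSITIVE_WORDS = ["gender"]  # hypothetical stand-in; the real list is not public


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace each sensitive word with a random placeholder.

    Returns the masked text and a placeholder -> original mapping.
    """
    mapping: dict[str, str] = {}
    pattern = re.compile("|".join(map(re.escape, SENSITIVE_WORDS)), re.IGNORECASE)

    def swap(match: re.Match) -> str:
        original = match.group(0)
        # Reuse the placeholder if this exact spelling was already seen,
        # so the same identifier stays the same identifier after masking.
        for placeholder, word in mapping.items():
            if word == original:
                return placeholder
        # Leading letter keeps the placeholder usable as an identifier.
        placeholder = "w" + secrets.token_hex(4)
        mapping[placeholder] = original
        return placeholder

    return pattern.sub(swap, text), mapping


def unmask(text: str, mapping: dict[str, str]) -> str:
    """Swap the placeholders back to the original words."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


source = '"gender": Type.String(), // user gender'
masked, mapping = mask(source)
restored = unmask(masked, mapping)
```

With this in place the model would only ever see an opaque identifier where the word was. Whether that is enough to stop it from inferring the topic from the surrounding code is a separate question.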
Yah-- it's unfortunate but it's easy. It might be OK to tolerate it if it's clearly outside the range of tokens used in suggestions, but the filtering doesn't use tokenized stuff.
> What I'm suggesting is that Copilot should keep working when these words are present, but refuse to attach any significance to the specific word. This could probably be implemented by replacing the problematic words with randomly generated strings before processing the text, then swapping those strings back afterwards.
The problem is, the trained model is much smarter than the keyword-based filtering. If you just whiteout the watchwords, it still has a pretty good chance of gleaning context and making a commentary on gender that Microsoft would rather not deal with.
> (It could be reasonable for Copilot to refuse to make suggestions at all if the output would contain truly offensive language, like unambiguous racial slurs or sexual terms. But "gender" clearly isn't that.)
Right now the list is quite a large variety of things. Mostly racial slurs and sexual terms. But letting an AI ramble on after "blacks" is kind of dangerous, as are various gender-related terms that do have innocuous interpretations. It's easy to put words in the filter list and much harder to try and use nuance on these topics that even humans struggle with nuance around.
The problem seems fairly limited to the USA from where I stand.
> Western society was starting to get to the point where its taboos were being shed and ideas could win based on their merit.
Name a year that things were actually better. (In the 40's, before the civil rights movement? In the 80's, when queer people were still regularly oppressed and excluded from participation in society? Did ideas "win on their merit" when police beat up people in gay bars?)
Exactly that. This is the current conflict.
Things were not better in the 40s, action movies were better in the 80s :p.
Things were trending towards a more open society; they had not become perfect by any stretch of the imagination. That trend, IMO, has reversed, due to the tactics involved. That is not to say some groups haven't benefited from this. There is a genuine drive to create a utopia here. However I fear the cure might be worse than the disease.
Maybe Clippy could pop up and say "it looks like you are writing hate speech" and offer some suggestions.
I guess if it's not explicitly censored then it's just a bug that Microsoft can fix.
Why make it about my opinions?
He works for GitHub.
>GitHub Copilot includes filters to block offensive language in the prompts and to avoid synthesizing suggestions in sensitive contexts.
I think calling gender a sensitive context is not unreasonable.
It is very unreasonable, but it's also the truth. sigh
Oh, wouldn't you know it... Turns out that almost all code doing something important might be able to be interpreted as sensitive.
> It is quite literally creating bias.
None of the people who do it care. One of the deceptive tactics that's pretty common in contemporary political discourse is to corrupt definitions in order to enforce controversial ideology using anodyne language.
>> Proponents view it as some kind of obvious universal good, and are confused when anyone else is appalled by the blind foolishness of it all.
IMHO, that "confusion" is an act.
These activists are free to write their own code that conforms to their ideology.
In turn, you can guarantee both groups of people end up upset.
Right now both groups are trying to silence the other-- with school libraries, etc, in the crossfire being angrily denounced.
Present something weird that is also easily verifiable. If you are having a bit of a break and are using copilot you can try out a few things and post answers.
And now we have independent verification (unless you think all these usernames are just lying) and some interesting bits of info about Copilot.
Disclaimer: I’m a queer non-binary leftist, and many of my community and loved ones are at least one of those. The closest thing to any “verboten” I’m aware of is “gender critical”, and as far as I can tell that’s mostly a term used by detractors in my own community, and even so it doesn’t reject usage of the term only usage of it distinguished from sex assigned at birth. The next most “verboten” I can think of is commonly referenced “wrong things programmers assume about ____” which generally offer no guidance other than not asking if you don’t need to know or offering open text input if you do, and in any case don’t represent a small powerful bubble of anything other than being a memorable link.
For some time I collected sources for things like the linked GitHub issue, but I had to stop as it made me unhappy. Now I try to ignore it and hope that I am no longer there when that type of thing hits my city.
It can be wrong or right but it is not making a judgment based on anything outside of math.
You are correct to say the training wasn't complete but that doesn't mean anyone did anything wrong, racist, or hateful... 99% of the time it's simply a mistake.
When you label things like that as racist instead of simply mistakes you water that word down to the point where it becomes meaningless.
The problem in the last 10+ years of outrage internet social justice is that, in order to gain attention and get traction, those involved have lumped so many things into terms like racism that it eventually becomes so stretched it's meaningless.
This is a failure to understand centuries of history. It’s an understandable one, it’s one I used to relate more to and I probably still relate to it far more than I should.
The notion of racism requiring malice is so far from reality that similar defenses were dismissed almost a century ago in international tribunals which still shape the world.
It takes no malice to participate in racism. It only takes accepting it as given. This doesn’t have anything to do with anything that’s originated from the internet, from any perspective. It doesn’t make racism meaningless. Treating it that way does though.
“The” problem is that racism, as a societal background factor, is treated as the sea in which we swim, it’s “neutral” without an actor present to promote it. If it just “is”, no one is “at fault” and… the kicker, if your definition requires intent and there isn’t any intent for the specifics under question… it’s not just a mistake, it has defenses like these to shield and bolster it.
You can rail against “social justice” all you like, and I’m betting my response will show your railing resonates more here than it should. But your position is ahistorical and probably based in defensiveness about something you don’t need to defend.
Two people can make the exact same remark and one can be racist and one can be based on innocent ignorance/curiosity. A young white child who is spending time with a black person for the first time and says "your hair is weird" is vastly different from that same person saying it in high school while bullying the black kid in class. The former isn't racist and the latter is.
I don't rail against social justice, progress is good and I think everyone of every creed / sexuality / gender / etc should be free to express themselves and live their best lives without being judged for who they were born or identify as.
What I do rail against, though, is manipulating language to bully and harass people because chasing social credit / status / clout by always trying to find demons to expose has become the norm. I personally believe that people who do this (often the "social justice warriors" so to speak) are the root of most of the radicalization of BOTH sides of the political spectrum in the western world right now.
But I also definitely do think there is something worse about someone who hates black people and uses a racial slur to describe them compared to a model trained on humanity doing the same. Certainly both are huge problems, and it can't slip my mind that the racist person was also just trained on humanity's racism.
Then Microsoft should very clearly state this so that customers who don't like this ideology know that they are not desired, and can get away from Microsoft products as far as possible for them.
https://www.microsoft.com/design/inclusive/
Also check out their AI design related doc, which specifically mentions how they try to avoid association bias, for example, those related to gender:
https://www.microsoft.com/design/assets/inclusive/InclusiveD...
This is not normal, right? I mean, outside US.
Copilot, pretty clear from the context.
Why did you suggest THIS as an example of what he's talking about? He doesn't indicate that he disagrees with this case.
Furthermore, that sounds like a problem of having incomplete training data. Regardless, manually tweaking a model points to a failure in the process somewhere.
He seems to be pooh-poohing the entire idea of "eliminating bias" in AI. So I felt it was important to:
* point out that there are clear cases of bias in AI no matter where you stand on gender
* move on to explain a closely related case (using historical speech about race could be offensive)
* use the lesson to show that using historical speech about gender could be problematic as well
> Furthermore, that sounds like a problem of having incomplete training data.
Training a model from historical data can only reflect historical approaches. The social conventions around gender are changing rapidly and are contentious.
> Regardless, manually tweaking a model points to a failure in the process somewhere.
Here, there's no manual tweaking of the model: merely a refusal to return results in an area where the model has proven problematic.
If you can't effectively train something from existing data then cherry-picking results according to different values isn't going to fix it. Your example has quietly shifted from facial recognition of different races to speech about different races. I can't even be sure of what you're talking about other than the fact that you will oppose criticism of imparting political bias into models.
Would you also respect the wishes of a schizophrenic person, if they say much the same thing? If they say that they are actually an alien from outer space, would you play along?
In general, I would respect someone's wishes. If they want to be Mork from outer space, K.
Of course, there are some very limited cases where we may reasonably believe that playing along is harmful either to ourselves or to the other person. If there's a broad medical consensus that something is harmful to someone, then maybe we shouldn't do it.
A biologically female person who wants to be called "they," because they have decided they don't like the connotations attached to "she" right now, doesn't rise anywhere close to that in my opinion.
The programmer should be able to use whatever the hell terms they want to use in their program. If the customer base doesn't like it that's their right. But it's not the right of the damn language parser programmer.
This isn't a language parser.
This is a tool that suggests implementations of small portions of code.
If the training data is out of date, it's quite reasonable for people employing that model to decide it shouldn't return results based on the out-of-date training data.
Even about completely different things. If the output is C code containing gets(), maybe we should decline to return the result.
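As a rough sketch of what that kind of "decline to return" post-filter might look like (the `BLOCKED_PATTERNS` list and `filter_suggestion` function are hypothetical names for illustration, not how Copilot actually works):

```python
import re
from typing import Optional

# Hypothetical blocklist: if any pattern matches, we decline the suggestion.
# Illustration only; this is an assumption, not Copilot's real filter.
BLOCKED_PATTERNS = [
    re.compile(r"\bgets\s*\("),  # unsafe C function, removed from the standard in C11
]

def filter_suggestion(suggestion: str) -> Optional[str]:
    """Return the suggestion unchanged, or None if it matches a blocked pattern."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(suggestion):
            return None  # decline rather than try to rewrite the model's output
    return suggestion
```

Note that the model itself is untouched here; the filter only decides whether an already generated result gets shown.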
> The programmer should be able to use whatever the hell terms they want to use in their program.
Indeed, it leaves it completely up to the programmer by refusing to suggest an implementation that would favor either side of the debate.
It's not "out-of-date". That's just the kind of pilpul semantic framing that these activists engage in since "out-of-date" implies "bad". The data is just not in line with their artificially made-up ideology. A demand which one, even as the best "ally" in the world, could never satisfy anyway, since the grievance grifting relies on always coming up with new issues, you just have to look at the shift from "equality" to "equity" or from "microaggressions" to "nanoaggressions"
> The programmer should be able to use whatever the hell terms they want
> language parser
Are you unsure what Copilot is?
But something like Copilot or DALL-E? If you ask DALL-E for a doctor and it rarely shows black people (or women), then it is neither racist nor broken. Our society is broken. There are not enough people in that job that are not white and male. Or they are not represented enough. I think there is value in AI that honestly reflects society, because it makes this discrepancy harder to ignore.
People imagined AI would be this benevolent, neutral, wise thing that would maybe be a bit naive but not have our human biases. But it turns out there is no "morally neutral". It will just reflect what you put into it.
But people tend to insult everybody who does not care so much about such a topic as racist.
Have you looked at the actual demographics of medical doctors in the US? 54% are women, and 35% are nonwhite. But when we have media depictions of doctors, I agree they tend to be white and male.
So, what should DALL-E conform to? Should it conform to A) our actual present society, B) the biased original dataset (which leans both towards the past and towards existing media biases), or C) some idealized version of society?
I got 12 white dudes, one Southeast Asian woman, one Southeast Asian looking man, and two men that I'm not sure of their race when I tried this just now (quite possibly white). This is despite OpenAI's efforts to debias it, and isn't representative of current physician demographics.
But if AI just represents and reinforces extant biases-- and worse, AI is used to produce art and text that ends up in other AIs' training sets -- how do we ever get out of this mess? The people who produce, publish, and productize AI do have some degree of editorial responsibility.
> But it turns out there is no "morally neutral".
Of course not. Hume pointed out long ago that you can't transform positive statements into normative ones.
But all of this is a little off-topic, anyways. This is about when it's reasonable to refuse to return a result. "Hey, your answer had the N-word in it, and we know that most of the time our model does that, it's offensive-- so we're just not going to return a result, sorry." I think this is a reasonable path to take when you know that your model has some behaviors that are socially questionable.
What's the issue? I used to watch a lot of medical dramas on TV and in my opinion the black rockstar MDs are way overrepresented in comparison to their real-life numbers:
'5.0%' in 2018 in the US[1] in real-life vs. '19.4%'[2] on TV
[1] https://www.aamc.org/data-reports/workforce/interactive-data...
[2] https://www.bluetoad.com/publication/?i=671309&article_id=37...
I think techno-libertarian suggestions like these are dangerous because they assume there’s one “canonical” place to fix these issues and all other places can just reflect the status quo, without affecting it (which in my opinion is not possible).
It’s like the old saying “dress for the job you want, not the job you have”.
Social problems are messy and full of situations like this where people can reasonably disagree and have decent, good-faith rationales for both sides, and we lack the kind of evidence that allows us to have strong confidence in our guesses about what would help.
If you insist on calling a black person a "negro", despite its change in connotation over time, you are not being very nice.
If you train an AI, or a person, using old books to call someone a "negro", you're condoning and continuing offensive behavior.
Ditto, here.
> since the grievance grifting relies on always coming up with new issues,
We pretty clearly, culturally, have a whole lot of issues. Becoming more nuanced in how we label them makes sense. And, of course, language changes rapidly.
It especially changes rapidly when we're talking about marginalized groups. Pejoration is a process by which a word associated with marginalized groups becomes offensive over time. "Idiot", "moron", "retard" were all originally clinical and relatively non-offensive words, but society as a whole ended up changing them to include a value judgment. The euphemism treadmill is annoying, but insisting on continuing to call someone something that has developed a negative value judgment is not really good, either.
Again: "merely a refusal to return results in an area where the model has proven problematic."
> Your example has quietly shifted from facial recognition of different races to speech about different races.
Again, three points:
* First, no matter how you feel about gender: bias in AI is a problem, as evidenced by issues with recognizing black faces.
* Second, there are some obvious cases where we can all agree that using past training data could result in things that are currently offensive. There are pieces of language we pretty much all agree we should use differently now to avoid offense (e.g. mongoloid).
* Third, I believe that gender is one of these cases. Social mores are evolving. Using conventions from the past when our collective norms are changing on the span of months basically guarantees offense.
Given the variance in the utility of Copilot's suggestions, this doesn't seem true on its face. Define "effectively" here and I think cherry-picking would definitely fall within its range.
Well, this clearly isn't the case in the DALL-E training dataset, because "medical doctor" overwhelmingly yields white dudes-- even after OpenAI's effort at removing bias.