Notes on AI Bias

146 points by andrevoget 7 years ago | 119 comments

twa927 7 years ago |

> Until about 2013, if you wanted to make a software system that could, say, recognise a cat in a photo, you would write logical steps. You’d make something that looked for edges in an image, and an eye detector, and a texture analyser for fur, and try to count legs, and so on, and you’d bolt them all together...

I write a lot of such algorithms (well, not for images). Does anyone know if such algorithms have a name? I call them "heuristics" and I think they fall under "AI".

voidifremoved 7 years ago | |

A while ago, Google Photos autogenerated a video for me from my photo library. It was about a minute long, stitched together dozens of photos, was called "dog video", and came with a horrifying yapping dog soundtrack.

Every single photo was of a cat.

I have to say I was humbled by the amount of human and computing power that had gone into developing this system over the years, that could achieve such a complicated, impressive technical feat, without requiring any effort or money on my part, and yet also be 100% wrong.

Bartweiss 7 years ago | | |

> be 100% wrong

This really is quite impressive. It's rare for humans to do worse than random guessing on tasks, and they almost never do much worse. There's something almost charming about the ability of AI to put real effort into actively avoiding correct answers.

jofer 7 years ago | |

For the specific example given there, I'd say it's most often called feature engineering. I'd also argue that it's a lot more necessary than most people think, but I'm probably just being stodgy and am biased by working in relatively narrow domains.

Calling it "feature engineering" implies it's still being fed into some sort of trained classifier to make the final decision, though.

What you're describing of your own work might better fall under the broad category of an "expert system".

piker 7 years ago | |

Expert systems? https://en.wikipedia.org/wiki/Expert_system

twa927 7 years ago | | |

I think expert systems consist of a "rule engine" where rules can be added dynamically?

microtherion 7 years ago | |

I kind of like the tongue-in-cheek moniker "GOFAI" (Good, Old-Fashioned AI), though that is applied more to symbolic AI: https://www.cs.swarthmore.edu/~eroberts/cs91/projects/ethics...

AJRF 7 years ago | |

Maybe image segmentation? In my AI class it was referred to as image segmentation and edge detection (interchangeably).

https://en.wikipedia.org/wiki/Image_segmentation

chobeat 7 years ago | |

I call these approaches: "there must be OpenCV in there somewhere"

bhl 7 years ago | |

Heuristic algorithms using hand-crafted features.

frankbreetz 7 years ago | |

First order logic rule-based system

mv4 7 years ago | |

This is similar to bag-of-words models:

https://en.wikipedia.org/wiki/Bag-of-words_model_in_computer...

layoutIfNeeded 7 years ago | |

I would call it “classical” machine learning.

twa927 7 years ago | | |

Hmm, I think there's no "machine learning" here. There's a human hard-coding some thought process, using mostly some simple statistics/thresholds to e.g. define what a "fur texture" looks like.
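
To make the distinction concrete, here is a minimal sketch of what such a hand-coded "fur texture" check might look like. The function name, the threshold value, and the sample patches are all invented for illustration; nothing is learned from data, a human simply picked a statistic and a cutoff.

```python
from statistics import pvariance

# Hypothetical cutoff, picked by hand for illustration only.
FUR_VARIANCE_THRESHOLD = 200.0


def looks_like_fur(patch):
    """Return True if a grayscale patch (rows of 0-255 ints) is 'busy' enough.

    The rule is hard-coded: high pixel variance is treated as a fur-like texture.
    """
    pixels = [p for row in patch for p in row]
    return pvariance(pixels) > FUR_VARIANCE_THRESHOLD


# Two made-up 3x3 patches: one nearly uniform, one with strong local contrast.
flat_patch = [[120, 121, 119], [120, 120, 122], [121, 119, 120]]
busy_patch = [[90, 140, 100], [150, 95, 135], [100, 145, 90]]

print(looks_like_fur(flat_patch), looks_like_fur(busy_patch))
```

A real system of this kind would bolt many such checks together, which is why the threshold tuning tends to be the fragile part.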

fvdessen 7 years ago |

> Since Amazon’s current employee base skews male, the examples of ‘successful hires’ also, mechanistically, skewed male and so, therefore, did this system’s selection of resumés. Amazon spotted this and the system was never put into production.

Couldn't they have retrained the system with a 50/50 mix of male and female resumes? Or restricted the use of the algorithm to sorting male resumes? Or maybe resumes don't actually correlate at all with success at Amazon...

gambler 7 years ago |

>The most obvious and immediately concerning place that this issue can be manifested is in human diversity.

I swear, when someone starts building autonomous killer robots, the first set of concerned articles will probably be asking whether robots were properly trained to target all genders and races with equal accuracy. This is not a sensible way to approach AI ethics.

>It was recently reported that Amazon had tried building a machine learning system to screen resumés for recruitment. Since Amazon’s current employee base skews male, the examples of ‘successful hires’ also, mechanistically, skewed male and so, therefore, did this system’s selection of resumés.

There is nothing "mechanistic" about this. It depends on how you select sample resumes and how you split them between "good" and "bad" labels.

I worked on a similar thing as an "encouraged" side-project at a certain company. Except I realized from day 1 that using AI on resumes is a bad idea and aimed to show this with data. My model was aiming to detect people who would quit or get fired within the first 6 months (with the intent of lowering them in priority for interviews, supposedly). It miraculously achieved 85% accuracy... by figuring out how to detect summer interns.

Framing this problem as "bias" and especially hyper-focusing everyone's attention on the diversity aspect of it is extremely irresponsible. (I'm not saying that's what the author is doing, but that's definitely what's being done at large.) Fundamentally, there are significant higher-level problems with using statistical ML models for things like hiring or crime prediction.
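
The summer-intern shortcut is easy to reproduce with toy numbers. The sketch below is not the model described above; the counts are invented purely so the arithmetic lands on 85%, and the "model" is a one-line rule standing in for whatever the real classifier learned.

```python
# Invented counts, chosen only so the overall accuracy comes out at 85%.
records = (
    [{"is_intern": True, "left_early": True}] * 200      # interns always leave
    + [{"is_intern": False, "left_early": True}] * 150   # regular hires who left
    + [{"is_intern": False, "left_early": False}] * 650  # regular hires who stayed
)


def predict_left_early(record):
    # The "model" has learned nothing except the intern shortcut.
    return record["is_intern"]


def accuracy(rows):
    hits = sum(predict_left_early(r) == r["left_early"] for r in rows)
    return hits / len(rows)


overall = accuracy(records)

# How many of the regular hires who actually left early does the rule catch?
regular_leavers = [r for r in records if not r["is_intern"] and r["left_early"]]
caught = sum(predict_left_early(r) for r in regular_leavers)

print(f"overall accuracy: {overall:.2f}")
print(f"regular leavers caught: {caught} of {len(regular_leavers)}")
```

Under these assumed counts the headline accuracy looks respectable while the rule identifies none of the cases anyone would actually care about.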

chobeat 7 years ago |

I've just added this post to my reading list. I'm sharing it in case anybody is interested in this and similar topics: https://github.com/chobeat/awesome-critical-tech-reading-lis...

Zolomon 7 years ago |

There is a course on this at New York University: https://dataresponsibly.github.io/courses/spring19/

killjoywashere 7 years ago |

I actually think this is where ML really shines. You can pick things apart. Sure, you might need carefully designed experiments, but you can subtract "female" from the resume and look for other data that cause some trained machine to activate, like patterns of word choice, etc. This is akin to the Go players learning from AlphaGo. It's actually a richly rewarding investigation for those of us who have done it. To discover a whole class of failure modes, that's success! And, unlike courts of law, the process is much more efficient, because you don't have to contend with a defendant appealing to matters of intent or the emotions of a jury.

Someone 7 years ago |

Short way to describe the problem: we want to build systems that detect causation, but statistical models can only detect correlation.

wongarsu 7 years ago | |

That's not entirely true: it's hard to show causation, but with enough data you can. If A correlates with B you know that either A causes B, B causes A, some C causes both A and B, or the correlation is a coincidence. If you have the data to rule out 3 of those the remaining possibility is the causation.

Someone 7 years ago | | |

So, how do you, for example, rule out “some C causes both A and B”, if you may not even know of the existence of C?

More importantly, the only way to really show causation is by positing a mechanism.
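
The "some C causes both A and B" case can be sketched with a small simulation. Everything here is an assumption made for illustration: the variables are synthetic, C is constructed to drive both A and B, and the helper `pearson` is written out by hand. It shows that A and B correlate strongly until C is accounted for, which only helps if C was measured in the first place.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# C is the hidden common cause; A and B never influence each other.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 0.5) for ci in c]
b = [ci + rng.gauss(0, 0.5) for ci in c]

corr_ab = pearson(a, b)

# Remove C's contribution (its coefficient is 1 by construction) and re-check.
a_resid = [ai - ci for ai, ci in zip(a, c)]
b_resid = [bi - ci for bi, ci in zip(b, c)]
corr_resid = pearson(a_resid, b_resid)

print(f"corr(A, B) = {corr_ab:.2f}, after removing C = {corr_resid:.2f}")
```

If C is never recorded, the second step is simply unavailable, and the first number is all the model gets to see.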

eanzenberg 7 years ago |

>>Now, suppose that 75% of the bad turbines use a Siemens sensor and only 12% of the good turbines use one (and suppose this has no connection to the failure). The system will build a model to spot turbines with Siemens sensors. Oops.

Given a large enough sample, there are two possible outcomes: 1) The Siemens sensor actually is at fault. 2) The Siemens sensor is part of a larger system, which is different in non-Siemens turbines, and that system is failing.

Either way, the model's prediction of turbine failures is enhanced by that Siemens feature. But to even get to this granularity, you are diving into model explainability, or which features were important for each prediction. Here, you try to understand the black box to find the reasons for a particular input->output.
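
As a rough worked example of how much that feature moves a prediction, Bayes' rule can be applied to the quoted 75% and 12% figures. The base failure rate is not given anywhere in the quote, so the 10% used below is purely an assumption, as is the function name.

```python
def p_bad_given_siemens(p_bad, p_siemens_given_bad=0.75, p_siemens_given_good=0.12):
    """Posterior probability that a turbine is bad, given it has a Siemens sensor.

    The 0.75 and 0.12 defaults come from the quoted example; p_bad is a
    base failure rate that the caller has to supply.
    """
    p_siemens = p_siemens_given_bad * p_bad + p_siemens_given_good * (1 - p_bad)
    return p_siemens_given_bad * p_bad / p_siemens


# Assumed base rate: 10% of turbines are bad (not stated in the source).
posterior = p_bad_given_siemens(p_bad=0.10)
print(f"P(bad | Siemens) = {posterior:.3f}")
```

Under that assumed base rate, seeing the sensor raises the estimated failure probability from 10% to roughly 41%, whether or not the sensor has anything to do with the failure.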

jgon 7 years ago |

This quote stands out to me:

"just as a dog is much better at finding drugs than people, but you wouldn’t convict someone on a dog’s evidence. And dogs are much more intelligent than any machine learning."

Because in my head I followed it with the sentence "but we're all confident that we will have dogs driving our cars in about 5 years." Food for thought for sure.

dmix 7 years ago | |

So dogs are better than humans at detecting drugs because they have a better sense of smell that can penetrate packaging. What does that have to do with technology being better/worse than humans at driving, exactly?

They didn't say dogs were better than technology at solving problems, in any sort of general sense.