Google, Meta, others will have to explain algorithms under new EU legislation

Google, Meta, others will have to explain algorithms under new EU legislation (theverge.com)

360 points by niklasmtj 4 years ago | 211 comments

TavsiE9s 4 years ago |

That's going to be interesting to see if the whole credit scoring industry will have to disclose their algorithms as well.

Looking at you, Schufa.

miketery 4 years ago | |

My understanding regarding credit score is that it's one of the most regulated and explainable algorithms.

It's not transparent to the public, but it is auditable due to anti-discrimination regulations. Am I wrong on this?

delusional 4 years ago | | |

We're doing credit scoring at the Danish bank I work at. One of our requirements is that the model and architecture have to be able to provide explanations for why your rating is whatever it is, to regulators, internal auditors, and customers alike.

Personally, I think denying people a loan is a pretty impactful decision in people's lives. They deserve a reason.
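
As a rough illustration of what "able to provide explanations" can mean in practice, here is a minimal sketch of a linear scorecard that reports which inputs pulled a score down the most. This is not the bank's model: the feature names, weights, baseline applicant, and base score are all invented for the example.

```python
# Hypothetical linear scorecard: every number and feature name here is made up.
WEIGHTS = {"missed_payments": -40.0, "years_of_history": 8.0, "utilization_pct": -1.5}
BASELINE = {"missed_payments": 0.0, "years_of_history": 10.0, "utilization_pct": 20.0}
BASE_SCORE = 700.0


def score_with_reasons(applicant, top_n=2):
    """Return (score, reasons).

    Each feature contributes weight * (value - baseline value). The reasons
    are the features with the most negative contributions, worst first.
    """
    contributions = {
        name: WEIGHTS[name] * (applicant[name] - BASELINE[name]) for name in WEIGHTS
    }
    score = BASE_SCORE + sum(contributions.values())
    # Keep only features that lowered the score, most damaging first.
    negative = sorted((value, name) for name, value in contributions.items() if value < 0)
    reasons = [name for _, name in negative[:top_n]]
    return score, reasons


applicant = {"missed_payments": 3, "years_of_history": 4, "utilization_pct": 60}
score, reasons = score_with_reasons(applicant)
print(score, reasons)
```

With a linear model the explanation falls out of the arithmetic; more complex models need an extra attribution step to get the same kind of answer.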

mike_d 4 years ago | | |

At least in the United States, anyone can create a credit score. There is VantageScore (the one you get for free from CreditKarma or your bank/credit card), at least 20 different versions of FICO, JSS Scorelogix, LexisNexis RiskView, Equifax FICO and RISK, ChexSystems Consumer Score, PRBC, and a million others.

You have a right as a consumer to the underlying data from the credit reporting bureaus, but not the proprietary algorithms that determine risk.

I understand the desire for transparency, but at its core credit scoring is fraud prevention. It is like asking Visa to explain what criteria they use to determine if a charge is fraud, which predominately helps the people trying to do the frauds.

TavsiE9s 4 years ago | | |

For one of them operating in Germany that’s not the case. They are - allegedly - factoring the area you live in, marital status, etc. into your score and have so far refused to disclose their algorithm and methods.

jasfi 4 years ago |

If you read the article they're asking for more than explaining algorithms. Overall they want the tech providers to be responsible.

Explaining algorithms could, in theory, give away a competitive advantage. However fairness to users seems to be a priority in this decision.

ethbr0 4 years ago | |

>> "Large online platforms like Facebook will have to make the working of their recommender algorithms (e.g. used for sorting content on the News Feed or suggesting TV shows on Netflix) transparent to users. Users should also be offered a recommender system “not based on profiling.”"

Both of those seem like good ideas and progress. The non-profiled recommender system option especially!

It's also really bothered me that tech companies of sufficient size can discriminate against legally-protected classes because "algorithms are complicated" and government regulators haven't pushed.

I'm not a fan of regulating design or use, but I'm a huge proponent of requiring transparency and detail on demand.

We'll see how willing the EU is to levy fines for breaches.

It's no doubt a consequence of most huge tech companies being American, but it's been refreshing to see the repeated "We have a law; You clearly broke it; Here's your fine" follow-through thus far from EU enforcement.

judge2020 4 years ago | | |

> It's also really bothered me that tech companies of sufficient size can discriminate against legally-protected classes because "algorithms are complicated" and government regulators haven't pushed.

Care to elaborate? Discrimination in terms of what ads are displayed perhaps?

dmitriid 4 years ago | | |

> We'll see how willing the EU is to levy fines for breaches.

It has been very slow with GDPR, I expect it to be even slower here.

otherotherchris 4 years ago | |

Things like "fairness" aren't defined in the legislation and will be determined in smoke filled rooms by shadowy moneyed interests.

Ordinary users will get censored. By the courts, by unelected regulators, and by Big Tech AI zealously nuking content to avoid arbitrary fines. It's content ID on steroids.

jasfi 4 years ago | | |

I agree that it could get out of hand. We'll have to wait and see how it turns out. Since this is an EU law I wonder if it applies to content hosted on EU servers only, or any content that shows up in their users' results.

omegalulw 4 years ago | |

I would love to first see a technical definition of fairness from EU that can be used to evaluate algorithms. That is a non-trivial detail often overlooked from these discussions.

bzxcvbn 4 years ago | | |

This is 2022, you have this information at your fingertips. https://eur-lex.europa.eu/legal-content/en/TXT/?uri=COM%3A20...

> Article 29 Recommender systems

> 1. Very large online platforms that use recommender systems shall set out in their terms and conditions, in a clear, accessible and easily comprehensible manner, the main parameters used in their recommender systems, as well as any options for the recipients of the service to modify or influence those main parameters that they may have made available, including at least one option which is not based on profiling, within the meaning of Article 4 (4) of Regulation (EU) 2016/679.

> 2. Where several options are available pursuant to paragraph 1, very large online platforms shall provide an easily accessible functionality on their online interface allowing the recipient of the service to select and to modify at any time their preferred option for each of the recommender systems that determines the relative order of information presented to them.
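
A minimal sketch of what paragraph 2 could look like in code, assuming a hypothetical feed with two ranking options, one of which ignores the user profile entirely. `Item`, `RANKERS`, and `build_feed` are names invented for illustration; nothing here comes from the regulation or from any real platform.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    timestamp: int  # seconds since epoch
    topic: str


def rank_chronological(items, user_profile=None):
    """Option not based on profiling: newest first, profile is ignored."""
    return sorted(items, key=lambda item: item.timestamp, reverse=True)


def rank_personalized(items, user_profile):
    """Profiling-based option: topics the user engaged with more come first,
    with recency as the tie-breaker."""
    return sorted(
        items,
        key=lambda item: (user_profile.get(item.topic, 0), item.timestamp),
        reverse=True,
    )


# The user-selectable options; the key is what a settings screen would store.
RANKERS = {"chronological": rank_chronological, "personalized": rank_personalized}


def build_feed(items, user_profile, preferred_option):
    """Order the feed using whichever option the user last selected."""
    return RANKERS[preferred_option](items, user_profile)


items = [Item("a", 1, "sports"), Item("b", 3, "news"), Item("c", 2, "sports")]
profile = {"sports": 5}  # hypothetical engagement counts per topic
print([item.title for item in build_feed(items, profile, "chronological")])
print([item.title for item in build_feed(items, profile, "personalized")])
```

The "main parameters" disclosure would then amount to describing the sort keys in plain language.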

rtsil 4 years ago | | |

Not technical, but fairness is the opposite of "our algorithm is so complicated that we can't prevent it from penalizing you even if you are not at fault. Unless you reach the top of HN, in which case we will manually intervene to fix things."

vmception 4 years ago | |

Nevada Gaming Control Board requires source code of all the casino games

Easy to see this concept expanding

leksak 4 years ago | | |

This deserves more attention as it does set a good precedent

pmoriarty 4 years ago | |

"Explaining algorithms could, in theory, give away a competitive advantage."

Why should anyone care if they have a competitive advantage?

If anything I want them to have a disadvantage, lose money, and go out of business.

pwdisswordfish9 4 years ago | |

> Explaining algorithms could, in theory, give away a competitive advantage.

Which is good. We could use some more competition on the market.

Irishsteve 4 years ago |

The new regulation requires a hand-wavy style of explanation, i.e. "we built a retrieval / ranking / matching algorithm that learns from customer clicks and considers blah blah."

There will be no explanation of the actual algorithm.

rm_-rf_slash 4 years ago | |

Probability ranking principle will never not be the foundation for recommender systems. Any statistical model flowing from PRP will be more or less the same regardless of whether the architecture is neural, boosted trees, etc.

However, if regulation required companies to disclose all of the data that goes into those models, how they acquire it (tracking browser/app behavior, purchase from 3rd parties), and so on, that would be the real game changer for consumer privacy and protection.
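
For readers unfamiliar with the probability ranking principle mentioned above: it says to present candidates in decreasing order of their estimated probability of being relevant to the user. A minimal sketch, where `estimate_probability` stands in for whatever model produces the estimates (neural, boosted trees, or otherwise) and the numbers are invented:

```python
def rank_by_relevance_probability(candidates, estimate_probability):
    """Order candidates by decreasing estimated probability of relevance.

    `estimate_probability` is any callable mapping a candidate to a number
    in [0, 1]; the ranking step itself does not care how it was computed.
    """
    scored = [(estimate_probability(candidate), candidate) for candidate in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored]


# Hypothetical per-item relevance estimates for one user.
estimates = {"video_a": 0.2, "video_b": 0.9, "video_c": 0.5}
ranking = rank_by_relevance_probability(list(estimates), estimates.get)
print(ranking)
```

The ranking step is identical whatever produced the estimates, so what actually differs between companies is the data behind them.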

postsantum 4 years ago | |

No explanation - no market. If you don't like it, you can run your business in another contract jurisdiction. I hope these companies will taste their own medicine, we desperately need competition

johnny22 4 years ago | | |

the person you're replying to is saying that their explanation will be trivial and thus useless in determining the responses.

I don't know if that's true or not myself though, since I haven't read it myself.

trevortheblack 4 years ago |

Six percent annual turnover for non-compliance seems too low.

Should be six percent for first offense, 12% for second, 25% for third, etc.

Until the company fixes its compliance or becomes insolvent.

icedchocolate 4 years ago | |

This opinion is terrifying to me. Why has our culture become so authoritarian?

yifanl 4 years ago | | |

The alternative is that corporations can (or rather, can continue to) factor in selectively breaking laws as a line item in their operating costs, which is unappealing for certain people for a variety of reasons.

hiptobecubic 4 years ago | | |

Escalating punishment based on repeat offense until we find a level that is convincing enough to stop you from offending is not a particularly new idea.

account42 4 years ago | | |

If an individual continues to defy the law they get sent to jail. Why should corporations get away without similar escalating consequences?

sumedh 4 years ago | | |

because companies knowingly violate laws then get a slap on the wrist when they get caught and no one goes to jail when they are caught.

doh 4 years ago | |

Fines are a fine balance between motivational and liquidating.

I think 6% is quite a lot, even if one has 40% margin. Investors will be highly distraught and seek remedies from the current management. But for instance at 20% they will blame the regulators and push the company to fight in courts.

In any case, government wants to motivate a change in behavior, not take companies out of business.

account42 4 years ago | | |

> In any case, government wants to motivate a change in behavior, not take companies out of business.

Taking maliciously noncompliant companies out of business can be a way to motivate others to not try to skirt the law.

jdrc 4 years ago |

Requiring an alternative to algorithmic sorting (chronological) is good even though most sites do it already. "Explaining the algorithms" sounds like an impossible-to-implement, feel-good clause.

Requiring transparency for bans and censorship though will probably have a major effect if people start asking nosy questions and exposing corporate and government abuses of power. Many EU governments will regret that users can expose them; that will be fun to watch. It will also make it very hard for companies like reddit to function: could reddit be legally liable for actions of its moderators?

The other clauses are the typical wishful thinking by EU legislators who think that you can legislate the solution to unsolved or unsolvable tech problems.

3a2d29 4 years ago | |

To add to this, I wonder how "explaining the algorithm" will work with algorithms that are trained with ML. Essentially they are black boxes, right? So as a tech company, would I have to just say what my best guess is on how it works?

deadlocked 4 years ago | | |

As a tech company you know what outcome you want your ML algorithm to produce and you presumably have some way of figuring out whether or not it’s producing that outcome. Presumably you also know what’s being fed to the ML algorithm as training material.

poisonborz 4 years ago | | |

I guess you always have a set of weights that the system tries to work towards.

jdrc 4 years ago | | |

"gradient descent" should be enough

tjbiddle 4 years ago |

> “Dark patterns” — confusing or deceptive user interfaces designed to steer users into making certain choices — will be prohibited. The EU says that, as a rule, cancelling subscriptions should be as easy as signing up for them.

This is an excellent addition.

MeteorMarc 4 years ago |

It will be fun to see Google's algorithm for ranking search results.

otherotherchris 4 years ago | |

"Here's 100PiB of unlabeled neural net weights. Knock yourselves out."

simion314 4 years ago | | |

>"Here's 100PiB of unlabeled neural net weights. Knock yourselves out."

You need to give the user an explanation on why you blocked his account, but if Google is kind enough to add on top the secret neural network then some people would be happy to have a look at it and find even more garbage in it.

jasfi 4 years ago | | |

They want to know how the algorithms work, not the data itself.

thedeadfish 4 years ago | | |

Google manually adjusts its results for censorship reasons. This is probably why google has gotten so much worse, they don't want information to be freely accessible, they only want things they approve of to be seen.

frereubu 4 years ago |

IANAL, so happy to be corrected, but my understanding is that EU and US law work in quite different ways. EU law sets general rules, and law courts decide what that means with reference to existing legal precedents. US law is very, very specific about what each clause means and how it should be interpreted.

Every time I see these kinds of discussions I wonder if quite a few of the disagreements are due to e.g. US commenters worried by the relative lack of specific details.

krastanov 4 years ago | |

Did you flip EU and US in your comment? My understanding is the exact opposite of what you wrote:

- US, common law, https://en.wikipedia.org/wiki/Common_law

- EU, civil law, https://en.wikipedia.org/wiki/Civil_law_(legal_system)

Citing: Civil law is a legal system originating in mainland Europe and adopted in much of the world. The civil law system is intellectualized within the framework of Roman law, and with core principles codified into a referable system, which serves as the primary source of law. The civil law system is often contrasted with the common law system, which originated in medieval England, whose intellectual framework historically came from uncodified judge-made case law, and gives precedential authority to prior court decisions.

HWR_14 4 years ago | |

The US has federal laws that apply to all the states. The EU has binding resolutions (the general rules you mentioned), and then each nation passes its own implementation. It's similar to the US federal REAL ID Act, which set standards for licenses that the states could implement however they wanted.

Mentlo 4 years ago |

Before people get overly excited about this - it will be very important to see how exactly it's worded in the legislation itself.

Anti-discrimination legislation has already made black-box algorithms illegal if they are deciding on anything that a user might take objection to - so for most use cases this is not a big change.

As for "the recommender systems will have to not be based on profiling": unless we're talking about removing recommender systems based on data altogether, it will be interesting to see what the legislation considers profiling. If I tie your recommendations to the last viewed piece of content (content contextual recommendation), is that profiling? It's arguably worse for the user and for society than profile-based recommendation. If the recommendations are based on your explicit categories, is that not profiling? Yet it's the principle used in news aggregators for the last 30 years.

The wording is going to be important here.

arnvald 4 years ago |

> The greater the size, the greater the responsibilities of online platforms

> as a rule, cancelling subscriptions should be as easy as signing up for them

Overall I like these principles, but we'll see in a few years how they're enforced in practice. It's been 4-5 years since we've had GDPR and I still see sites that require tens of clicks to disable all advertising cookies (and the most I've seen was 300+ clicks). Even Google only this week announced they'll add a "reject all" button to their cookie banners.

I expect it'll be similar in this case, companies will do bare minimum to try to stay compliant with the regulation, and it will take a few years to see real differences, but I hope it's at least a step in the right direction.

mijamo 4 years ago | |

4-5 years is nothing for the law. You have murders from 15 years ago still in process in courts. But eventually things settle. It just takes time. It's a bit like ents

FollowingTheDao 4 years ago | |

> as a rule, cancelling subscriptions should be as easy as signing up for them

Before I sign up for any service this is the first thing I check.

vampiretooth1 4 years ago |

What actually constitutes a full explanation of the algorithm? The article doesn't get into this enough; it mentions that a high-level overview is required but not much else. I can imagine that it's not going to require sharing the codebase or IP, of course.

swayvil 4 years ago |

I see a vast technical writing documentation project in their future.

jdrc 4 years ago | |

But they must be short and easy to understand by users. Like this:

"Our algorithms use gradient descent. Data flows through our connected tubes, slowly wiggling their size until the data starts flowing back and forth faster."

swayvil 4 years ago | | |

That's quite awesome. It's like Dr Seuss.

dontblink 4 years ago |

Regarding the rules surrounding fake information, I wonder why the EU hasn't taken a similar stance against Fox News equivalents?

dehrmann 4 years ago | |

Rules around fake information scare me because they're a limit on speech, and as Russia has recently shown, fake information is anything a dictator doesn't approve of.

Broken_Hippo 4 years ago | | |

> Russia has recently shown, fake information is anything a dictator doesn't approve of.

This isn't an issue of "limits on speech", but rather, another reminder that one shouldn't enable folks to become dictators. Not having some reasonable limits on actual misinformation makes us all less free, however, because we cannot put our trust in some organizations.

walkhour 4 years ago | |

Because we wouldn't have any media whatsoever to consume.

FpUser 4 years ago |

I am curious how would they "explain" AI algorithms where it is impossible to explain how / why the decision has been made.

kazamaloo 4 years ago | |

Not that the algorithms are impossible to explain, but in some cases the real explanations might require explanations, too. But I think companies will probably get away with hand-wavy explanations like "you get this recommendation because you watched this movie", neglecting all the sourcing/ranking/filtering workflows.

jumpifzero 4 years ago |

In principle looks good but lots of potential for going wrong.

Just hope this doesn't backfire. The cookie law was also a thing the EU created with good intentions after some politicians decided "omg cookies are bad" and we ended up still using cookies but pop-ups in every single website basically forcing you to accept the use of cookies.

FollowingTheDao 4 years ago |

"They have electrolytes."

neatze 4 years ago |

If I had to guess, it's probably similar to the explanation the SEC or NYSE requires when you do suspicious trades.

narrator 4 years ago |

"We run it through our deep learning model. Here's 50 gigabytes of neural net weights."

mnau 4 years ago | |

> Here's 50 gigabytes of neural net weights.

... from a few months ago. Weights change daily, most likely updated by another NN.

I guess it's nice that lawmakers understand that at some point these companies used algorithms to search or sort stuff, but industry has already moved to another level. We might be able to explain a specific result of a neural network (Shapley values or something like that), but the actual algorithm (=NN)... no way.
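
To make the Shapley-values idea concrete, here is a minimal sketch that explains one specific output of a black-box scoring function by averaging each input's marginal contribution over every ordering of the inputs. `toy_model` and its feature names are invented stand-ins for a real network, and the exact computation is only feasible for a handful of features because the number of orderings grows factorially; practical tools approximate it.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one prediction.

    Features are switched from their baseline value to the instance value
    one at a time, in every possible order; a feature's Shapley value is
    its average marginal effect on the model output.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            current[name] = instance[name]
            new_output = model(current)
            totals[name] += new_output - previous_output
            previous_output = new_output
    return {name: totals[name] / len(orderings) for name in names}


def toy_model(x):
    """Stand-in for an opaque model: one additive term, one interaction term."""
    return 2 * x["watch_time"] + x["likes"] * x["follows"]


instance = {"watch_time": 3, "likes": 2, "follows": 4}
baseline = {"watch_time": 0, "likes": 0, "follows": 0}
values = shapley_values(toy_model, instance, baseline)
print(values)
```

The attributions always sum to the difference between the model's output on the instance and on the baseline, which is what makes them usable as a per-result explanation even when the model itself stays opaque.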

Vespasian 4 years ago | | |

Well then they can explain things like what inputs they are using (with concrete examples for each user), what metrics they are optimizing their NNs for, how their product success is measured, what their internal research is focused on, etc.

I feel a lot of people on HN are looking at this from a technical standpoint while lawmakers are more interested in how these companies plan and position themselves. "Explain how they maximize profits and shareholder value" would be more accurate in my opinion.

throwaway4323r 4 years ago |

Unless you can reproduce it, this is not going to cut it. I don't think this is going to help. It only creates more useless software jobs.

anothernewdude 4 years ago |

"Linear Algebra"

LightG 4 years ago |

Good. More.

wave-creator 4 years ago |

Do you think the EU will enforce the law so that non-US and non-EU companies like TikTok will disclose theirs too? It will be interesting to see if they will apply the law equally to all.

manquer 4 years ago | |

All the big guys roughly know what kind of pipelines everyone has; they hire from each other, etc.

The level of disclosure is not going to break a lot of competitive advantage.

They basically need to say what input sources and feedback they use, and describe in modular blocks what steps go into the pipeline; nobody is asking them to expose the actual weights of the billion-parameter ML models they all probably have.

Even if hypothetically they did expose that level of detail, it would be useless for regulators, as they don’t have the resources to run the model, and testing a model for side effects in depth is hard.

somewhereoutth 4 years ago |

A good start. However let's go further, simply ban personal tracking and personalized algorithmic feeds. This would combat the echo chamber effect and social media can become a broad community experience, like TV and newspapers. It would also cripple tech advertising revenues, thus redressing the balance with traditional media.

ben_w 4 years ago | |

I’ve seen the difference between what YouTube presents to me when I’m logged in v.s. when I’m on a clean computer it can’t associate with me, and I do value the personalisation — when not logged in it shows me a hundred duds for every one thing I care about, and logged in it’s about 50:50.

How much of this improvement is a mysterious machine learning algorithm and how much is it just looking for new things from my subscription list, I’m not sure, and that’s important: being trapped in a torrent of self-reinforcing falsehoods is something I fell for in my teenage-goth-New-Age phase, which Carl Sagan condemned in The Demon-Haunted World, and which people in general have been falling for with every sycophant and propagandist from soothsayers to tabloids telling them what they want to be so.

PolygonSheep 4 years ago | | |

> I’m not sure, and that’s important: being trapped in a torrent of self-reinforcing falsehoods is something I fell for in my teenage-goth-New-Age phase, which Carl Sagan condemned in The Demon-Haunted World.

Genuinely curious here: how can you tell you've escaped one set of self-reinforcing falsehoods while being sure you haven't fallen into another, different set?

EnderShadow8 4 years ago | | |

An alternative might be for personalisation to be opt-in rather than opt-out even when signed in with an account (which shouldn't even be necessary for many services anyway)

hairofadog 4 years ago | |

One of the things that frustrates me about discussions of censorship here on HN is that there’s a lot of intense focus on censorship via deleting a tweet or Facebook post, but no focus given to the more insidious problem of censorship by algorithm.

I am wholeheartedly in favor of a free marketplace of ideas where (we would hope) good ideas win out over bad, but as it is, once you’re deemed by an algorithm to be susceptible to a certain category of extremist information, that’s all you’re ever going to see again; the competing ideas are never going to have a chance.

Algorithmic distribution of ideas is sorta like distributing ideas via gasoline-powered leaf blower directly to the face. I am free to speak my competing ideas, and so technically I haven’t been censored, but no audience is going to hear me over the leaf blower.

visarga 4 years ago | | |

We need to get some level of control over the criteria for ranking and filtering. A third one is the UI - it is the place where all sorts of dark patterns hide.

I'd like to see the browser put in a sandbox and its inputs/outputs sanitised and de-biased before being presented to the user. Could also protect privacy more. We need more browser innovation. A neural net should be in every browser ready to apply semantic rules.

pessimizer 4 years ago | |

I don't think people should be forced into the public square by law. If you want to live in an echo chamber, you should be able to. We don't forcibly close convents. If I want to choose the "Smart Feed™", although I can choose not to, that should be my choice to make.

I don't know TikTok, but people seem to like its choices.

somewhereoutth 4 years ago | | |

I think the distinction is that generally it should be a conscious choice to be in the echo chamber - and not the easy unknowing default choice (if you even have a choice) for smart feeds.

tester89 4 years ago | |

> personalized algorithmic feeds

But there are good uses, like for music. I can’t really think of a downside for music tbh, it’s not like music tends to spread extremism, and on the upside lesser known artists have a better shot at being discovered through the algorithm.

vimsee 4 years ago | |

I usually have few reasons for being concerned about the future.

I think wars (even with the on-going war that Russia started), climate issues (even with the high consumption present today) and poverty (even with many countries still in it) will all trend downward. However, this echo chamber fueled by misinformation is one of the things I care about.

I am so happy the EU has the power and will to make good changes that give mutual benefit to everyone when other parts of the world do not.

cush 4 years ago | |

I think it's fair to use personal data collected on the same site. Without it, most sites would be rendered useless.

somewhereoutth 4 years ago | | |

Indeed - that would be a legitimate use of cookies etc. In fact, if that were enforced, we can get rid of the annoying cookie warnings.

BeFlatXIII 4 years ago | |

How does a ban on personalized algorithmic feeds work if each user is subscribed to a different set of others?

somewhereoutth 4 years ago | | |

"news from friends" might be ok if it is presented without algorithmic curation - i.e. strictly on time order or similar.

marcinzm 4 years ago | |

So should I be prevented from following only anti-capitalistic people on twitter or only following right wing subreddits? Should people also be banned from subscribing only to a single newspaper versus a mix of newspapers with different political leanings? What about looking at only socialist web pages that link to other such web pages? Should web pages be forced to link to pages with other political leanings?

somewhereoutth 4 years ago | | |

No, that's not what I meant. More that it should be a conscious decision to knowingly consume content with a particular bias - as you do when you pick up a certain newspaper or turn over to a specific tv channel, as opposed to being algorithmically presented with a stream of content that may veer in a direction without you being aware.

ernirulez 4 years ago |

EU is becoming more like an authoritarian state. They put constraints on companies but allow governments to have full control and surveillance over their citizens. It's so hypocritical.

Epa095 4 years ago | |

I don't see how it's hypocritical to give democratically elected governments different possibilities than random private companies.

It's government's job to put constraints on companies, stopping them from becoming the absolute assholes they become if they have no limitations. That does not make them authoritarian.

netizen-936824 4 years ago | | |

Some people are sad that they can't set up businesses which exploit the populace as easily in the US I guess?

mediascreen 4 years ago | | |

Sure. On the other hand you can usually choose which companies to interact with. When it comes to governments the relationship is not optional. Your government usually has more ways to affect your life than most private companies have (like no-fly lists).

A few years ago the true extent of the Swedish program for tracking left wing sympathisers became known. It ran from the sixties up until 1998. For example, if your car was seen outside of a left wing publication you could end up on a list somewhere. That caused you to be automatically excluded from 5-10% of all jobs without you ever finding out about it until 20-30 years afterwards. Imagine wanting to become a police officer, a pilot or an engineer and never understanding that the reason you didn't get an interview was because you had parked in the wrong spot one day years before. Or that your sister briefly dated a left wing journalist at some point.

rafale 4 years ago | | |

The tyranny of majority is a real threat. Power should be "shared" (or under contention). Companies just want to take your cash, governments can take your freedom.