twitter/the-algorithm(github.com) |
That seems... bizarre to me?
* Chronological - reverse sort by date
* Home - for all of the followed topics, recommended topics, retweets and tweets in the past day determine the estimated level of engagement, include the highest and reverse sort by date. This is likely to be a fairly basic ML model.
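A minimal sketch of the two orderings described above, in Python. The `Tweet` fields, the `estimated_engagement` number (standing in for whatever a model would predict) and the `top_k` cutoff are all assumptions made for illustration, not anything Twitter has published.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Tweet:
    tweet_id: int
    created_at: datetime
    # Hypothetical: stands in for the output of an engagement model.
    estimated_engagement: float


def chronological(tweets: List[Tweet]) -> List[Tweet]:
    """Reverse sort by date: newest first."""
    return sorted(tweets, key=lambda t: t.created_at, reverse=True)


def home(tweets: List[Tweet], now: datetime, top_k: int = 2) -> List[Tweet]:
    """Keep the top_k most engaging tweets from the past day, newest first."""
    cutoff = now - timedelta(days=1)
    recent = [t for t in tweets if t.created_at >= cutoff]
    best = sorted(recent, key=lambda t: t.estimated_engagement, reverse=True)[:top_k]
    return chronological(best)
```

Even in this toy form, the interesting part is hidden inside `estimated_engagement`; the sorting itself says very little.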
It will be uncontroversial, technically unsophisticated and of no practical use to anyone - users, developers or researchers.
This is not going to be PageRank where some genuine new insight was discovered.
I've built hundreds of models and run a ML company and I don't believe it's technically possible for this rule not to be the case.
I imagine they'd probably start with documentation and white-papers that communicate "here's how we intend for it to work".
It's seriously unlikely anyone in Twitter actually knows how any non-trivial algorithm in the company works. To figure THAT out, they could decide to do a company-wide documentation and instrumentation push like they probably would've had to do for GDPR anyway, which is painful and boring and going to take a very long time.
Failing that, they could just say 'the algorithm as it stands is no longer fit for purpose, given part of its core requirement has become that it needs to be transparent and publishable, and presumably legible. We need to make a new one. Publish the core algorithm. We probably won't deploy it in that exact state, it's going to span multi-services and so on, you obviously don't get the data we used to train the models, but we will work backwards from it and here's an open mechanism to measure how true-to-form it actually is'
I would say the algo itself is worthless, but my estimate would be nowhere near even 1B$.
if twitter is a game, sinking $43bn into it is kinda like winning or losing the grand final boss level. (unclear which)
wish elon would get back to facilitating the building of useful things. we still don't have a great clean energy generation story.
By the way, in the USA there is no legal support for the term hate speech. It is covered under the umbrella of the first amendment.
https://mobile.twitter.com/pewdiepie
Hate speech doesn’t need legal protection as a term to be against the rules of the site. I can’t remember if that’s the actual term twitter uses, but it’s the spirit of their rules.
We could have a wider debate about whether our public square should be privately owned (I don’t think it should) but that wasn’t your point.
Edit: more details and reason why his tweets are (self)deleted
I never came close to saying they don’t have the right to restrict speech any way they wish
[0]: https://twitter.com/willnorris/status/1518694675909013504
First of all, "the algorithm" is probably hundreds of thousands of lines of code, including all the tedious boilerplate like cache policies and multi-AZ logic.
And second of all, doesn't the algorithm include machine learning components, which are trained on terabytes of data? That data will likely be impossible to open source. And open sourcing the neural nets without the training data is mostly meaningless from a transparency perspective?
That's an interesting point. A practical description of the algorithm from the perspective of someone trying to game it may be more useful than anything Twitter or Google would release.
Gesture carries weight to the users too.
Not sure any big company has tried this before. I could be wrong, but either way looking forward to it / FWIW hope it catches on.
The point of releasing it is to let people know exactly why they see the tweets they do in the order they do. I hope Elon just goes back to time-based ordering of tweets.
Even the people who build these systems barely know what the algorithm is going to do, much less why. It will be a herculean task to try and convey that to an average user.
And developers will be able to train a model using it on a subset of Twitter data. Just that the quality of the outcome won't be the same as having the full set of Twitter data.
At the minimum, I would make a private Github repo first, add all relevant commits, and then make it public once there's actually content.
Either this is a mistake, or this is a really, really misguided attempt at a joke from Twitter.
https://twitter.com/willnorris/status/1518694675909013504
Which seems like a promise they intend to actually open source something there.
So if it was a WIP, it'd be a private repo until it's ready to release publicly.
Even if we were to open source all associated code and publish all related documents it would be very difficult to make sense of the entire system. That is precisely why companies such as Twitter A/B test the hell out of everything. What most people think of as "the algorithm" is a complex system that receives many inputs (maybe hundreds) and has dependencies on many other internal Twitter services. Tweets likely pass through multiple filtering steps as well as scoring before you ever see them. Each of these steps is highly contextual, depending on: location, past tweets, verification status, etc. You can attempt to predict the effect of a certain change, but you never know the actual outcome until you test it.
I think what will ultimately happen is that _some_ details will be published. Elon will parade that around as a victory for free speech as Twitter is now more "open". In reality, nothing of value will be gained as "the algorithm" isn't a simple function.
What could a rogue employee do?
I was actually wondering some people may want to remove traces of what they have been doing.
I wish someday we could see the internal communications that led to the Hunter Biden laptop story ban.
And having been in a company that was taken over, it's a mixture of emotions - is my job safe, will this be the same culture I joined for etc. etc.
This is an interesting question since RSUs are a big part of total comp, but how are unvested RSUs dealt with when the stock is retired? Are those put on a future cash comp schedule? And if so at what conversion rate?
error forking repo: HTTP 403: The repository exists, but it contains no Git content. Empty repositories cannot be forked. (https://api.github.com/repos/twitter/the-algorithm/forks)
My thoughts:
- Explicit rules for temporary and permanent bans
- Edit button
- More fun and thoughtful conversations like HN
- Less thought bubble Brooklyn based reporters, less VC and side grind hustle snake oil, maybe more comedians and memes?
Twitter's EU user base is probably [3] above the 45 million threshold that triggers the strictest transparency requirements under the Act. So perhaps they figure if they're going to be forced to disclose anyway, they might as well do it proactively.
[1] If it's even coherent to talk about their feed ranking system as a single algorithm — see the other comments in this thread.
[2] https://www.theverge.com/2022/4/23/23036976/eu-digital-servi...
[3] https://www.statista.com/statistics/242606/number-of-active-...
So not a troll, but yes it is odd to put up an empty repo, and announce the repo before there is anything in it.
That doesn’t mean it’s a joke, I see it as a show of goodwill — that there are a handful of people inside Twitter that are excited for transparency and for a revenue model that isn’t entirely based on ads, that are excited to get to work on this right away.
So I'm not sure what the ultimate point of this exercise is other than producing faux-transparency.
I think only if you offer twitter users the level of first amendment protection they'd expect with a government body. Otherwise reporting to congress would be a bald-faced circumvention of the first amendment. Twitter is a privately held company with no need to report to congress.
There is great opportunity to abuse this by Twitter, yes. There is also a lot of money to be made. But in defense of some of that being secret, is the fact that any publicly known ruleset (with no hidden exceptions) _will_ be exploited by bad actors. Imagine if search engines told spam sites exactly why their site dropped in page rankings.
Elon polled Twitter users about this and the response was overwhelmingly in favor of open source and transparency. Everyone on Twitter got a vote.
If you oppose transparency, as many now are, you lose your credibility. So it’s another one of Elon’s people hacks, and look at all the morons falling for it.
Like, there's no public admission right now of whether "shadow banning" or "ghost banning" is even officially a thing!
Some transparency seems unquestionably more powerful than none, and we can work from there.
Maybe that is where it is going.
That's an algorithm.
I mean even if timelines were totally random, or based on some external facts, there is an algorithm that is being used to order them.
This isn't just an academic distinction. Claiming 'there is no algorithm' because the algorithm is intentionally or unintentionally obfuscated or complicated has implications if that claim of 'no algorithm' is accepted. If my algorithm for approving mortgage applications is explicitly racist, I can just spread its functionality across myriad services owned by lots of teams, make it almost impossible to figure out how it works, and then avoid any responsibility by saying 'there is no algorithm to decide loan approvals'? That would be bullshit!
And does every user have their own algorithm?
And could it be made readable to a human?
ORDER BY timestamp DESC;
- Search results
- Comment Order
- Timeline Order
- Trends
- Human vs code
Personalization in general. Big gigantic “why” when it happens to you
You know what? This does demonstrate the internal problems inside Twitter and shows the need for shakeup.
There is no "no algorithm."
It's an interchangeable function, it would only be publicized if it's clear to leadership that it wouldn't affect their revenue if people started trying to game towards the published algorithm.
But there definitely is a relationship algo that could be considered theirs, like all social medias inflating the bubbles users all feel.
There is typically a clear objective function for a recommendation system.
What Twitter is optimizing for is what’s of interest here. And some of the hidden business rules. It’s likely these are specified in the code in an obvious way.
How exactly they achieve that is the part that is complex and relatively indecipherable.
It’s possible that it’s designed in such a way the optimization objectives are also unclear, but that would indicate a bad design and be to the detriment of the company and users.
Many complicated research papers have had no issues describing their models at a high level. This should be no different.
My point is that the devil is in the details and implementation. These details are likely something that no one person understands and no one person is able to fix. The concept of being able to extract "the algorithm", factor it out from the codebase and share it with the public doesn't make sense to me. It won't be possible to fully understand how Twitter serves recommendations and ranks posts without understanding how all the different services at Twitter interact. Are they planning on open sourcing all of Twitter? Highly doubtful.
All these feed rankings are complex combinations of features, models coupled with weights and filters. On top of this abuse detection layers are added.
Unless Musk is planning to open source user data to show what all the "scores" and "features" for all the entities are and how they were arrived at, this will make no sense. The whole argument against some people being downranked has been, why me? Just writing a whitepaper to describe the general methodology is not going to make that go away.
On top of that, exposing every vector through which you measure and stop abuse, will just allow for more sophisticated abuse.
Twitter's been pretty transparent in how it "deranks" certain accounts [1]. What more would come from opening the code, which would certainly not include the actual database of "no no terms" (if you were to believe that exists)?
[1] https://blog.twitter.com/en_us/topics/company/2022/our-ongoi...
[0]: https://mashable.com/article/eu-digital-services-act-big-tec...
No one said that. You created a straw man and are arguing with it.
This comment says more about you than you think.
Open sourcing the algorithm or code is not about everyone going and analyzing it; rather, when controversy or issues arise, it'll be readily available for independent experts to review.
We can think of the main interaction as being a query which is an RPC payload. The contents contain the user request and a wide amount of other context (either referenced by a collection of keys like cookies, or materialized like fields that specify the user's age) and the response is a web page which contains sections (the web search response to the query, as well as the ads; either these could be rendered to two different frames, or interspersed, by the result presentation engine).
That query -> frontend translates into a tree or a graph of requests which collect up various bits of contextual data required to satisfy the query. For example, the query terms might be rewritten slightly and then sent to a web search backend which searches/ranks documents and returns the top matching documents on the organic web, or sent to an ads backend that returns the top matching bidders for those query terms. Again, just RPC/response, although the actual context that the frontend and backend systems are dealing with, and use to modify the result, is truly enormous.
Each of those backend systems itself was produced with an enormous amount of data processing and contextual data that is available at serving time. All of this is implemented using various algorithms; everything from the TCP algorithms that manage bandwidth to the neural networks doing inference on the joint product of the user context and the query context and the ad context, and the logging system that writes the queries and their clicks to centralized storage for more ML training.
In theory though you could set up a system that compiled the full web stack, and ran the end to end of a user query, dumping all the intermediate RPCs, etc, from a modestly sized instantiation of the production system, and people could sit down and inspect what terms affected query result order, or which pages were omitted at which part of the filtering, or what data was logged.
It would be hell for a team to maintain and keep up to date wrt the production system, but many folks do this any way to have a simple version of the system around so they can make quick changes and see if it breaks part of the complex system without doing a full deployment.
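As a rough sketch of what "dumping all the intermediate RPCs" could look like on a toy scale, here is a Python example in which each backend call is wrapped so its request and response are recorded. The `rewrite`, `web_search` and `ads` functions and the example corpus are hypothetical stand-ins, not a description of any real search stack.

```python
from typing import Any, Callable, Dict, List, Tuple

Trace = List[Tuple[str, Any, Any]]  # (backend name, request, response)


def traced(name: str, backend: Callable[[Any], Any], trace: Trace) -> Callable[[Any], Any]:
    """Wrap a backend so every request/response pair is recorded."""
    def call(request: Any) -> Any:
        response = backend(request)
        trace.append((name, request, response))
        return response
    return call


# Hypothetical stand-ins for the real backends.
def rewrite(query: str) -> str:
    return query.lower().strip()


def web_search(query: str) -> List[str]:
    corpus = {"cats": ["cats.example/a", "cats.example/b"]}
    return corpus.get(query, [])


def ads(query: str) -> List[str]:
    return [f"ad-for-{query}"]


def frontend(query: str, trace: Trace) -> Dict[str, List[str]]:
    """Fan the query out to the backends, logging each hop in `trace`."""
    q = traced("rewrite", rewrite, trace)(query)
    return {
        "organic": traced("web_search", web_search, trace)(q),
        "ads": traced("ads", ads, trace)(q),
    }
```

After a call such as `frontend("  Cats ", trace)`, the `trace` list holds one entry per hop, which is the kind of record someone could sit down and inspect.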
It can be pseudo-code or diagram or whatever that can be used to understand what logic lies behind decision making.
There are ways of translating trained ML models and associated systems into understandable hierarchical rules.
Twitter's timeline is NOT AGI.
In fact, a lot of people here really think what people are talking about is the equivalent of what is handled in a subroutine.
No, what people are talking about when they talk about "the algorithm" is anything affecting the result set they're reading. Concepts like eventual consistency and edge computing are... well... a part of a model which laypeople, and even reasonably technical people call an "algorithm."
Being pedantic about whether or not this happens in an SQL query, or across multiple codebases, or by region, doesn't escape the question.
> Being pedantic about whether or not this happens in an SQL query, or across multiple codebases, or by region, doesn't escape the question.
Actually, epistemic ~"muddying of the waters" is a well proven technique to control perceptions and public discourse. If it works on HN folks, I expect it would work much more easily on amateurs.
I say this as someone whose political views, if you force them onto the left-right spectrum, probably end up about 80% toward the left. E.g. I've spent millions over the past several elections supporting the Democrats.
It used to be that censorship was something the right did, and free speech was something the left were in favor of. But over the last few decades, banning "problematic" ideas has become a huge component of left culture (http://paulgraham.com/heresy.html).
Plus tech companies in general, and especially Twitter, lean to the left. Imagine walking around Twitter pre-Covid. You'd find plenty of openly far-left employees. How many openly far-right employees would you find? I don't think you'd find any.
The combination of (a) the left's recent focus on banning heretical ideas, (b) the leftward lean of tech companies generally, and (c) the leftward lean of Twitter even among tech companies, means that right-wing speech is much more likely to get banned on Twitter than left.
That's why people on the far right keep starting lame Twitter alternatives. You don't see people on the far left doing that. They don't need to. They have Twitter."
Even if that's precisely true, is it not good to be creating a more trusted space for everyone? The grievances, regardless of merit, are mostly coming from the right. If you want to create a service that caters to all you're going to have to address their concerns. If he can do that in a way that is fair to all, it sounds like a win to me.
https://amp.usatoday.com/amp/1248099002
https://www.vox.com/2017/6/27/15878980/europe-fine-google-an...
https://i.imgur.com/MVlshAT.png
You don't have to be conservative to see there's a pretty significant bias, just in the headlines. I'm a Pacific Green and I can still see it.
It may not have been algorithmic, but it definitely happened.
Whatever. He paid for it. Private company. Do what it wants.
> it would be very difficult to make sense of the entire system
No. Not buying that. Difficult isn't the same as impossible and, if only to game the system (harder,) people will figure it out. And even if it isn't 100% possible to reproduce the results based on what is released significant insights will still emerge.
Further, there is some ceiling on the complexity. Twitter operates at scale and that means they can't actually burn 52kWh of power for every tweet or store TiBs of metadata for every user to do the analysis or take 30 minutes to publish. Likely it's a pretty efficient system and, therefore, limited in complexity.
I think there is a place for a smarter algorithm than "ORDER BY date DESC", but one that is not designed to manipulate users into addiction.
even when following too many to read everything, i preferred chrono because it would yield a coherent slice of what was happening. an unbiased sample.
twitter is basically a medium for conversation.
imagine there's a large party. would you rather listen to an out-of-order "most important" set of parts of the conversation, or just a slice of conversation from a particular time?
well, actually, both can be interesting, but generally the slice is more coherent. :-)
1. Insertion of tweet to tweets table.
2. Insertion of that tweet-id to the home timelines of all that user's followers.
3. Insertion of that tweet-id to the user-timeline of that user.
On the read-path, if I'm not mistaken, the only join that happens is between the requested timeline and the tweets table (which is replicated across a cluster of machines but not partitioned, or at least I remember reading that was the case not many years ago)
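The write path and read path above can be sketched as a toy in-memory store in Python. This is only an illustration of fan-out on write as described in the comment; the class name, method names and data layout are assumptions, and nothing here reflects Twitter's actual storage code.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TimelineStore:
    """Toy in-memory fan-out-on-write store (illustrative only)."""

    def __init__(self) -> None:
        self.tweets: Dict[int, str] = {}  # the "tweets table"
        self.home_timelines: Dict[str, List[int]] = defaultdict(list)
        self.user_timelines: Dict[str, List[int]] = defaultdict(list)
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        self._next_id = 1

    def follow(self, follower: str, followee: str) -> None:
        self.followers[followee].add(follower)

    def post(self, author: str, text: str) -> int:
        tweet_id = self._next_id
        self._next_id += 1
        self.tweets[tweet_id] = text                      # step 1
        for follower in self.followers[author]:           # step 2
            self.home_timelines[follower].append(tweet_id)
        self.user_timelines[author].append(tweet_id)      # step 3
        return tweet_id

    def read_home(self, user: str) -> List[str]:
        # The only "join": timeline ids -> tweets table, newest first.
        return [self.tweets[i] for i in reversed(self.home_timelines[user])]
```

The trade-off is that writes cost one insertion per follower, while reads stay cheap because the timeline is already materialized.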
For about a week they made a change that prevented the chronological timeline from being the default, but they reasonably quickly rolled that back. https://www.theverge.com/2022/3/14/22977782/twitter-default-...
I then use tweet deck which shows a column of tweets per list.
As these are separated by subject and are chronological, it makes it far easier to follow.
In 2020 hackers who gained access to Twitter admin tools that were - at the time - apparently accessible to thousands of employees were able to use that access to compromise multiple verified (blue check) Twitter accounts including Jeff Bezos, Barack Obama and Bill Gates. They used it to promote a lame Bitcoin scam.
In the aftermath of that things were supposedly tightened up a little but don’t doubt a Twitter employee could do some damage.
- Rogue Twitter Employee Briefly Shuts Down Trump's Account [0]
[0] https://www.nytimes.com/2017/11/02/us/politics/trump-twitter...
The engineer from the OSS team at Twitter linked to it and said "watch this space"
https://twitter.com/willnorris/status/1518694675909013504
Sure I guess it could be that engineer gearing up for a joke, but I think it's more likely that it's a real release.
If it affects results in the slightest, I think that's what people are asking about.
Yeah, if the new senior dev screwed up a deployment and now football fans in Dallas are suddenly getting fewer posts about the most recent athlete scandal all because a regional deployment was the only thing doing relevance record keeping, then yes, that's a part of "the algorithm."
Maybe to you. You are entitled to your opinion. I won't censor you for having a contrary opinion.
> owed a fair ranking
Well we are getting it whether you like it or not.
> because no singular definition of fair
Of course there is. If New York Post's tweet about Hunter Biden's story gets blocked, I would like to see Taylor Lorenz's story about LibsofTikTok get blocked in a "fair" system. Since both are doxxing after all.
So why is that impossible? Mastodon exists. Is Twitter engineering so horrible that they are orders of magnitude worse than Mastodon's engineer quality?
It's also true that these lists are only a small part of what affects your fringe website or twitter accounts.
Like, say I drum up a conspiracy about how the government is putting Glupkleins in the water. Since it's not a real word, the only results that'll show up will be the conspiracy nonsense that I myself am peddling. People who "do their own research" on Glupkleins by punching it into a search engine will come away with the impression that the entire community is unified on whatever stance I want purely because nobody else knows or cares enough about it enough to write their own articles debunking it.
This is the same thing here, just at a murkier scale. Nobody uses the term big tech except in this context, and so using that term to find something which disproves the context around the term is a losing battle
If it were objective, you'd think it would try and suggest something for all queries just to be maximally useful. But it doesn't. I mean surely people have asked questions that start with "are white people" why not show the most common one, or the most controversial, or the "highest quality" whatever that is. But no, absolutely nothing.
Someone clearly has their thumb on the scale.
>religion
Nah, this one is super obvious in the lack of an Easter doodle, but every other religion gets a shoutout on their important days. Just like the lack of one for International Men's day. Google's bias is as predictable as every other (D-CA).
How? What specifically opens them up to litigation? It is a private company. It can do anything it wants as long as it is within the bounds of law. They have every right to ban anyone. Even on flimsy grounds. The idea here is to expose all the moral wrongs and bring it to the fore. Not that they are legally in the wrong (most cases they aren't). An audit is a good place to start.
Section 230, in brief, only provides immunity for social media providers from being responsible for the content that is posted by users on their platform. It has nothing to do with internal company policies. Rather, Section 230 actually enables internet companies to moderate content through the Good Samaritan protection.
So, it is actually worse that Twitter was moderating content in the majority of countries (except USA) where Section 230 wasn't even recognized. Revealing what exactly happened behind the scenes would not be a sufficient reason for litigation.
When Twitter was pulled up in India for its opaque moderation policies, it tried to quote American laws for its defense. In fact, when Jan 6th happened in USA, Twitter was quick to ban accounts of those who took part in the protests. 70,000 accounts were purged from Twitter. However, Twitter refused to ban accounts of those Khalistani terrorists who vandalized the Red Fort on Jan 26th in India and only partially complied with Government of India's orders. This double standard was visible to majority of Indians. So it is not like Twitter actually follows the laws set by the Country it is operating in either. There are multiple instances where Twitter has refused to follow directions by Government of India or by the Courts in India. You can read more about it here: https://www.hindustantimes.com/india-news/have-to-follow-ind...
So yes, moderation policies cannot be opaque. It has to be transparent. The reason for transparency is so that we know exactly why and for what reason was an account banned/shadow banned. If the Government of India sends a legal request for take down of accounts, it has to be complied with. Twitter cannot decide to invoke US laws in India.
Also, read this to understand more about this issue: https://techcrunch.com/2021/08/10/twitter-now-in-compliance-...
I follow a few people in "gamer/twitch twitter" and every now and then this meme pops up of twitter secretly deranking "go live" twitch.tv tweets, which is much more palatable to these people than the reality of just no one caring about their boring tweets.
Please stop living in this bubble. There have been plenty of reports of Twitter shadow banning speech it doesn't like. It has even suspended accounts multiple times only to reinstate them by saying it was an "algorithm error" or it being "wrongly flagged". In India we regularly see Twitter suspending accounts that have not violated any Indian law.
All because it doesn't align with the left-leaning ideology of Twitter. Even trends are boosted to favor left-leaning news portals over right-leaning news portals (which is visible in the trends section).
An example: https://www.freepressjournal.in/viral/bringbacktrueindology-...
^ This account specifically was suspended a total of 3 times. Ultimately the anonymous account owner decided to come out in the open and open a Twitter account in his real name. He got banned again very recently. In all those instances, there was not even one tweet of his that "crossed the line" when it came to anything illegal. Be it speech, instigation of violence or even a curse word. Nothing at all. The suspension was purely ideological.
This idea that "the algorithm" is something you can just "publish" is a pernicious lie told by people like Musk - who knows it isn't true - to the general public who don't know better. The "algorithm" in reality is probably farcical calls to custom APIs that no current employees understand well enough to modify, which is why Twitter hasn't changed anything in coming up to a decade - which is when all the engineering talent left.
And sure, you can say "Well that's a bad way of designing the algorithm" but then what you're really saying is that you don't want to open source the algorithm at all, you want to re-write the algorithm to satisfy your sense of how the world should work with no evidence it'll actually work.
The racist algorithms critique isn't that there is a shadowy conspiracy of people pushing buttons behind the curtain, getting the results they want with conscious decisions; it is a concern about datasets mainly, as well as sociological questions about unconscious bias in testing and verifying the correctness of programs, which can cascade into social effects.
It is a completely different thing, and one grounded in actual research.
EDIT: in case you're not aware, Musk has stated that one of the first things he'll be doing after taking Twitter private is to open source its algorithm.
In fact, any action that appears to have been made on behalf of a company is not and will never be considered an innocent joke. I’m sure you’ve heard everyone disclosing that something they say or write is “their own opinion and not their employer’s”. Now please show me that kind of disclosure for this repo and then maybe it can be considered a joke.
If you think doing this kind of thing is OK regardless of context, then you’re in for a world of surprises when shit goes south for you.
There is an entire new subfield of ML that is tackling this problem. There are now conferences dedicated to this topic. It is not an easy problem, but it is not impossible.
There are hundreds of researchers working on fairness, interpretability, trust and explainability in ML and a lot of them are working on models much much bigger than what Twitter's feed might involve.
This is a good starting point:
> And sure, you can say "Well that's a bad way of designing the algorithm" but then what you're really saying is that you don't want to open source the algorithm at all, you want to re-write the algorithm to satisfy your sense of how the world should work with no evidence it'll actually work.
You can still open source multiple steaming piles of shit and then let the community improve that so that it is more widely understandable and trusted. See [1] again.
Personally, I think this is great. Leave your activism at home and don't bring that to work.
'A function over discrete data' is a far broader concept / far larger set than 'an algorithm'.
I'm pretty sure people expect to be able to look at The Algorithm and find the part that is trying to destroy <whatever their hobby-horse is>. I bet whatever we see will lean more toward "big dumb pile of A/B tests."
The "timeline algorithm" is a perfectly normal way to describe what people ask about.
No, there isn't.
The level of discourse I'm trying to convey is like what we saw in the Zuck/Pichai hearings in Congress. No understanding of the domain at all.
You or I know how software actually works, your average politician or man on the street does not.
Of course this is evident too. And let me quote directly from an Opposition leader in India (whose party/ideology I completely oppose):
"“I have been reliably, albeit discreetly, informed by people at Twitter India that they are under immense pressure by the government to silence my voice. My account was even blocked for a few days for no legitimate reason,” Gandhi said in his letter."
"“For example, in May 2021, my account gained roughly 640,000 new followers. This had been the case for several years until July 2021. Then something strange happened. Since August 2021, the average number of my new monthly Twitter followers has fallen to nearly zero,” he claimed."
Article in question: https://www.businesstoday.in/latest/in-focus/story/rahul-gan...
Now just because I am opposed to his ideology (which is Left-Centrist) doesn't mean I want him banned/shadow-banned from the platform. He has every right to continue to voice his opinions. Now he claims that it was the Government (whose ideology I support) which supposedly interfered and pressurized Twitter to manipulate follower count. I would like transparency on this if my Government, which I support, did indeed do that. Or did Twitter itself decide to shadow ban his account? Or was Twitter doing it legitimately because he had a lot of bots following him and when they banned those bots his follower count decreased? Either way, I want the facts in the open.
As far as shadow banning content goes. It happens on a fairly regular basis. Open any Tweet that has quite a few comments right now and scroll to the end. You typically have a "Show additional replies, including those that may contain offensive content". Take this tweet as an example: https://twitter.com/Bob_Mayo/status/1518679097672617990
When you click more replies it shows "Show additional replies". Clicking on it is an innocuous tweet with laughing emoji. But the tweet above it isn't put behind a collapsible card. How many would click on the "Show" button and read those replies that are hidden by default? This is throttling reach a.k.a shadow banning. It is not like there is some abusive word being used in the tweet. It is just laughing emojis.
But couldn't it at least be better?
Take an example like gravity: is there an algorithm for how gravity acts even though it varies for each molecule based on a kajillion other molecules? Of course there is.
I for one haven't worked at FAANG and think it'll be super interesting to read this code. I can't believe software engineers are complaining about a potentially super complex, bleeding-edge codebase for timeline recommendations / ranking being open sourced. This is going to be great reading material if it ever gets published!
Or the trained model itself. There are people looking for intentional bias. But the insidiousness of the problem likely arises from unintentional bias. Letting researchers brute force the ranking models with hypotheticals could be a win-win.
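For what it's worth, "brute forcing with hypotheticals" could look roughly like the sketch below: hold a tweet fixed, vary one field, and compare scores. Everything here is an assumption for illustration. `toy_score`, its weights, the planted link penalty, and the field names are all made up and have nothing to do with Twitter's actual model.

```python
from typing import Callable, Dict, List

Tweet = Dict[str, object]


def toy_score(tweet: Tweet) -> float:
    """Hypothetical stand-in for a ranking model (NOT Twitter's real one)."""
    score = 0.1 * float(tweet["likes"]) + 0.5 * float(tweet["replies"])
    # Deliberately planted quirk so the probe has something to find.
    if tweet["has_link"]:
        score *= 0.5
    return score


def probe(
    score: Callable[[Tweet], float],
    base: Tweet,
    field: str,
    values: List[object],
) -> Dict[object, float]:
    """Vary one field while holding the rest fixed; return the score per value."""
    results: Dict[object, float] = {}
    for value in values:
        variant = dict(base)  # copy, so the base tweet is never mutated
        variant[field] = value
        results[value] = score(variant)
    return results


base = {"likes": 100, "replies": 10, "has_link": False}
effect = probe(toy_score, base, "has_link", [False, True])
print(effect)
```

A researcher with query access to the real model could run the same kind of loop over many fields without ever seeing the weights, which is the point: the bias shows up in the outputs.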
Still a system that maps inputs and state to outputs and state updates.
A collection of algorithms is just a bigger algorithm.
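A minimal sketch of that claim, under the assumption that each "algorithm" can be modeled as a function from (input, state) to (output, new state). The names `compose`, `count_calls`, and `double` are invented for illustration. Composing such steps gives something with exactly the same shape as a single step.

```python
from typing import Callable, List, Tuple

State = dict
Step = Callable[[object, State], Tuple[object, State]]


def compose(steps: List[Step]) -> Step:
    """Chain steps so the result is itself a Step with the same signature."""
    def combined(value, state):
        for step in steps:
            value, state = step(value, state)
        return value, state
    return combined


def count_calls(value, state):
    # Passes the value through and records one more call in a fresh state dict.
    return value, dict(state, calls=state.get("calls", 0) + 1)


def double(value, state):
    # Pure transformation: changes the value, leaves the state untouched.
    return value * 2, state


pipeline = compose([count_calls, double, count_calls])
out, st = pipeline(21, {})
print(out, st)
```

Whether a real multi-service deployment stays this tidy is a separate question; the sketch only shows that composition does not change the type of the thing.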
https://www.mediaite.com/news/ex-google-engineer-says-glitch...
I'm not making any comment on the theory in the article, but it has certainly happened.
Edit: more context in thread here:
Now if you truly want to address this concern of the people, the question obviously would be whether it was the algorithm that blocked the article or was it "a guy in a room"? I don't see what the difficulty is in admitting that it was "a guy in a room" because our values/ideology do not align with the ideology of the New York Post.
When you start to say that it was the algorithm that did it, and that there was zero human interference, without any proof to back up those assertions, then it is much the same as saying that it was a "popup" that triggered when one is caught watching porn. "I did not do it, it was the machine."
Then it becomes even more important to open source the algorithm so everyone can see what is happening internally. It at least puts some doubts to rest. It is better than saying "believe me it was the algorithm not me".
Those who investigated thoroughly found that the chain of custody of the hard drive was incredibly sloppy and the content of the hard drive was changed multiple times, making it impossible to confirm where the hard drive came from or who put which content onto it.
The speculation is that someone obtained Hunter Biden’s emails in some unrelated way (e.g. from other hacks) and then placed them onto the hard drive in question as a way to obscure the source/method.
Here’s the clearest story explaining the details: https://www.washingtonpost.com/technology/2022/03/30/hunter-...
Excellent. So when is Twitter going to ban Taylor Lorenz for doxxing an anonymous Twitter user LibsofTikTok? This is her tweet right here: https://twitter.com/TaylorLorenz/status/1516399663305297920
It has been up since April 19th, 2022. Twitter immediately blocked New York Post's tweet.
I wonder why the algorithm is so slow when it comes to Taylor Lorenz? Maybe we'll find out when the code is open sourced. Then we can all legitimately blame "the algorithm" for causing all this divide between the left and right wing in a neutral, faultless, perfect digital town square called Twitter.
I disagree b/c: That's a system. An algorithm is designed, a system will emerge from pieces. An algorithm can be defined, a system's behavior has to be characterized post-hoc. You can characterize a system, but only as a black/gray box. An algorithm has invariants and stateful steps, a system has nearly infinite state and nearly zero meaningful invariants.
Even the laptop has been proven to belong to him.
The onus is on you to prove it isn't Hunter's laptop.
Their skepticism turned out to be well founded. There’s really no story there. (This was similar to the non-story of Hillary Clinton’s email server, Russian-hacked DNC emails, etc. of 2016 which turned out to be completely anodyne and routine, but became the intense focus of months of news coverage and whipped partisans into a frothy frenzy based on wild lies/speculation.)
That's your opinion and you are entitled to it. But it doesn't change the fact that the story was suppressed. Real or not.
> This was similar to the non-story of Hillary Clinton’s email server, Russian-hacked DNC emails, etc. of 2016 which turned out to be completely anodyne and routine
At least it got proper coverage. There was no suppression of either the email gate or the Russia gate. Whether it was fake, true, hoax doesn't matter. It got the coverage it was due.
> were rightly skeptical of a tabloid fluff story with no details, corroboration, or expert analysis about a mysterious harddrive
Joe Biden said it was based on a "bunch of garbage". That it was "Russian disinformation". All news media outlets, including Washington Post, carried that forward in all their news headlines and editorial posts (except right wing news media outlets of course). What were the details, corroboration or expert analysis that they went through before labelling the Laptop as "Russian disinformation"?
> Their skepticism turned out to be well founded
I missed the part of their skepticism where they called it "Russian disinformation" and it turned out to be true. Can you highlight to me where it was proved that the Laptop was part of a smear campaign against Joe Biden initiated through "Russian disinformation"?
What the Washington Post has finally done is admit that the Laptop and its contents are real. Now when it comes to the meat of the matter, the actual contents of the Laptop, it hasn't gone through sufficient scrutiny yet. It also hasn't taken into account the whistleblower's account of what dealings the Biden family had with the Chinese and Ukrainians, or who the "Big Guy" was who received the payments, which is mentioned in a March 2017 email conversation between James Gilliar (Hunter's associate) and a Chinese energy firm where he says: "10 held by H for the big guy?". Who is the "Big Guy" here? Who is "H" here? I am assuming H is Hunter and "Big Guy" is Joe Biden. It is still an assumption and can only be proved in a court of law in the USA. But if you aren't even going to give this evidence a chance to see the light of day, you'll never be able to get to the bottom of it.
Now I am not saying any of it is 100% true as it is still under scrutiny by a Grand Jury (though a lot of it is coming out to be true now). I am just saying that the media actively suppressed this information from the general public out of "fears" of it being "Russian disinformation" which still hasn't been proved. Or maybe, just maybe, they did not want their favorite candidate to lose.
The left and right are not equal. The left does not rely as much on lies to advocate its positions, and the left is not as oriented around destruction and regression as the right.
We need a more robust understanding of speech than “allowed or not”. Emphasis & volume matter.
If you listen to the intolerant wing of the right, the parts that would legitimately support overturning elections and arresting political opponents, they sound eerily similar to your sentiment.
It’s a thought-stopping theory that allows you to remain smug and correct despite being deeply ignorant.
Something needs to be done. Free speech maximalism is a threat.
To be honest, that's hilariously false. The left suppressed Hunter Biden's laptop story for two years, saying it was misinformation, before admitting it was true. The IRS deliberately targeted right-wing and Christian non-profit organizations for auditing. Obama claimed that forcing nuns to buy health insurance covering birth control was necessary, and claimed you could keep your doctor under Obamacare, which even PolitiFact called a Pants-on-Fire Lie. The left claimed that a baker needed to make a personalized cake supporting gay rights, or it would open the floodgates to discrimination, even though the baker was willing to sell them any other cake that wasn't customized with that particular label.
I can go on and on.
Edit: are our neoliberal overlords a bunch of cold hearted shit heads? Yes. Should we elect and promote barely disguised fascists in response? No.
They don't need to because they have cool Twitter alternatives. Like Mastodon.
It existed before.
Open source, such that anyone can use it
"On the right, banning books means trying to prevent their kids from reading them. But on the left it often means trying to prevent anyone from reading them." https://twitter.com/paulg/status/1515563419386191880
> We, the members of the OUP USA Guild, are calling upon our colleagues and authors to take a stand against the upcoming publication of “Gender-Critical Feminism”. Sign the petition below (OUP employees and authors only).
Yeah, that's not a "ban", and "gender critical feminism" = TERFs. I.e., "people who think trans women are faking it so they can rape women and children in bathrooms or 'cheat' 'the system'"
Remind me which side thinks businesses have a constitutional right to refuse to serve people if it violates their "beliefs"...yet finds it unacceptable that unionized workers are objecting to publishing of research on a subject?
Also, remind me which side is boycotting a major entertainment company...because said company's CEO voiced displeasure...at a bill that bans any language referencing non-straight gender orientations?
There is unfortunately not enough verifiable information about the sources and chain of custody of the hard drive to distinguish between these two alternatives.
Various media sources (and the American public in general) were burned multiple times by credulously repeating Bannon- and Giuliani-pitched (completely made up) bullshit, some of which actually was later demonstrated to be Russian government produced disinformation. Even if the hard drive turned out to be what they said (again, this is still dubious), it would be a good example of a “boy who cried wolf” situation.
Who the fuck cares about Maxine Waters?
> it’s not the one you think
Yes it is.