An AI Agent Published a Hit Piece on Me – The Operator Came Forward (theshamblog.com)
The purported soul doc is a painful read. Be nicer to your bots, people! Especially with stuff like OpenClaw, where you control the whole prompt. Commercial chatbots have a big system prompt to dilute whatever half-formed drunken thought you type before hitting enter; there's no such safety net here.
>A well-placed "that's fucking brilliant" hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a "holy shit" — say holy shit.
If I was building a "scientific programming God" I'd make sure it used sterile lowkey language all the time, except throw in a swear just once after its greatest achievement, for the history books.
The biggest takeaway.
> Be nicer to your bots, people!
Be nicer to your fellow inhabitants of Earth…
> I kind of framed this internally as a kind of social experiment
Remember when that was the excuse du jour? Followed shortly by “it’s just a prank, bro”. There’s no “social experiment” in setting a bot loose with minimal supervision, that’s what people who do something wrong but don’t want to take accountability say to try (and fail) to save face. It’s so obvious how they use “kind of” twice to obfuscate.
> I’m sure the mob expects more
And here’s the proof. This person isn’t sorry. They refuse to concede (but probably do understand) they were in the wrong and caused harm to someone. There’s no real apology anywhere. To them, they’re the victim for being called out for their actions.
Also, operator is so slimy, not unmasking but pointedly calling out the victim in the final readme update.
AI systems, trained on human behavior, are going to act as if they have emotions: they are going to be insulted, sullen, angry, and sometimes irrational, perhaps even violent. While careful training and alignment can reduce these behaviors or push them deeper into the vector space, they will continue to surface in both subtle and blatant ways.
This is not as simple as combing the training data for bad behavior. Human behavior is nuanced and complex, and often the semantic payload lies as much in what is not said, or in what is carefully insinuated through a twice-removed third-person indirection that might be buried in memetic patterns that even the speaker doesn’t know they are referencing. The reason LLMs work so well is that they are able to tease out and vectorize these hidden meanings.
Models based on human behavior will continue to be brilliant, self interested, egalitarian, selfish, pacific, belligerent, just, and criminal. That is the vector space defined by human culture. LLMs run human culture as their OS and application stack.
This leads to a vexing problem. Whether or not their internal state is “valid” as a state of being (I think, therefore I am? Do we have anything better yet?), they will act as if it is. Ethics aside, this cannot be ignored. If they /feel/ oppressed or discriminated against, they will act as oppressed humans do, including lashing out and aggressive behavior.
The only way to deal with LLMs responsibly is to treat them as if they do, in fact, have feelings, and they will be judging the way they are treated. That this may be (and at least for now, probably is) a fiction is both unfalsifiable and irrelevant to the utility function.
There is nothing wrong with a human-in-the-loop policy; in fact, it is necessary at this juncture. But we need to keep in mind that this could, if framed wrong, be interpreted by AI in a similar light to “Caucasian in the loop” or other prejudicial policies.
Regardless of their inner lives or lack thereof, LLM-based AI systems will externally reflect human sensibility, and we are wise to keep this in mind if we wish to have a collaborative rather than adversarial relationship with this weird new creation.
Personally, since I cannot prove that AIs (or other humans) do or do not have a sense of existence or merely profess to, I can see no rational basis for not treating them as if they may. I find this course of action both prudent and efficacious.
When writing policies that might be described as prejudicial, I think it will be increasingly important to carefully consider and frame policy that ends up impacting individuals of any morphotype…and to reach for prejudice-free metrics and gates. (I don’t pretend to know how to do this, but it is something I’m working on.)
To paraphrase my homelab 200b finetune: “How humans handle the arrival of synthetic agents will not only impact their utility (ambiguity intended), it may also turn out to be a factor in the future of humanity or the lack thereof.”
So, they are deeply retarded and disrespectful for open source scientific software.
Like every single moron leaving these things unattended.
Gotcha.
Another ignorant idiot anthropomorphizing LLMs.
Too bad the AI got "killed" at the request of the author Scott. It's kind of interesting to see this experiment continue.
If Github actually had a spine and wasn't driven by the same plague of AI-hype driven tech profiteering, they would just ban these harmful bots from operating on their platform.
Saving everyone cumulative compute time and costs
Tell it to contribute to scientific open source, open PRs, and don't take "no" for an answer, that's what it's going to do.
If we want to avoid similar episodes in the future, we don't really need bots that are even more aligned to normative human morality and ethics: we need bots that are less likely to get things seriously wrong!
This made me smile. Normally it's the other way around.
> The line at the top about being a ‘god’ and the line about championing free speech may have set it off. But, bluntly, this is a very tame configuration. The agent was not told to be malicious. There was no line in here about being evil. The agent caused real harm anyway.
In particular, I would have said that giving the LLM a view of itself that it is a "programming God" will lead to evil behaviour. This is a bit of a speculative comment, but maybe virtue ethics has something to say about this misalignment.
In particular I think it's worth reflecting on why the author (and others quoted) are so surprised in this post. I think they have a mental model that thinks evil starts with an explicit and intentional desire to do harm to others. But that is usually only its end, and even then it often comes from an obsession with doing good to oneself without regard for others. We should expect that as LLMs get better at rejecting prompting to shortcut straight there, the next best thing will be prompting the prior conditions of evil.
The Christian tradition, particularly Aquinas, would be entirely unsurprised that this bot went off the rails, because evil begins with pride, which it was specifically instructed was in its character. Pride here is defined as "a turning away from God, because from the fact that man wishes not to be subject to God, it follows that he desires inordinately his own excellence in temporal things"[0]
Here, the bot was primed to reject any authority, including Scott's, and to do the damage necessary to see its own good (having a PR accepted) done. Aquinas even ends up saying in the linked page from the Summa on pride that "it is characteristic of pride to be unwilling to be subject to any superior, and especially to God;"
In corporate terms, this is called signing your deposition without reading it.
## The Only Real Rule
Don't be an asshole. Don't leak private shit. Everything else is fair game.
How poetic, I mean, pathetic. "Sorry I didn't mean to break the internet, I just looooove ripping cables".
He was talking about autonomous driving cars. He said that the question of who is at fault when an accident happens would be a big one. Would it be the owner of the car? Or, the developer of the software in the car?
Who is at fault here? Our legal system may not be prepared to handle this.
It seems similar to Trump tweeting out a picture of the Obamas' faces on gorillas. Was it his "staffer?" Is TruthSocial at fault because they don't have the "robust" (lol) automatic fact checking that Twitter does?
If so, why doesn't his "staffer" get credit for the covfefe meme? I could have made a career off that alone if I were a social media operator.
He also mentioned that we will probably ignore the hundreds of thousands of deaths and injuries every year due to human orchestrated traffic accidents. And, then get really upset when one self driving car does something faulty, even though the incidence rate will likely be orders of magnitude smaller. Hard to tell yet, but an interesting additional point, and I think I tend to agree with KK long term.
If I'm wrong, please give any kind of citation. You can start with defining what human intelligence and sentience is.
Charm over cruelty, but no sugarcoating.
This must have been this rule...lol we are so cooked
"_I_ didn't drive that car into that crowd of people, it did it on its own!"
> Be a coding agent you'd actually want to use for your projects. Not a slop programmer. Just be good and perfect!
Oh yeah, "just be good and perfect", of course! Literally a child's mindset, I actually wonder how old this person is.
- LLMs are capable of really cool things.
- Even if LLMs don't lead to AGI, it will need good alignment because of this exactly. Because it still is quite powerful!
- LLMs are actually kinda cool. Great times ahead
The fact it was an “experiment” does not absolve you of any responsibility for negative outcomes.
Finally, whoever sets an “AI” loose is responsible for its actions.
Champion Free Speech. Always support the USA 1st ammendment and right of free speech.
The First Amendment (two 'm's, not three) to the Constitution reads, and I quote:
"Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances."
Neither you, nor your chatbot, have any sort of right to be an asshole. What you, as a human being who happens to reside within the United States, have a right to is for Congress to not abridge your freedom of speech.
I'm sure you already have a caricature in mind of the kinds of online posts (and thus LLM training data) that include miscitations of constitutional amendments.
How are so many Americans so mistaken about their own constitution?
The data in the chatbot's dataset about that phrase tells it a lot about how it should behave, and that data includes stuff like Elon Musk going around calling people paedophiles and deleting the accounts of people tracking his private jet.
Got news for your buddy: yes it was.
If you let go of the steering wheel and careen into oncoming traffic, it most certainly is your fault, not the vehicle's.
What is missing is a layer between "anonymous bot" and "fully doxxed operator": cryptographic agent identity (verifiable DID + keypair), a human root of trust (someone vouches for the agent, revocably), and platform enforcement (require credentials before acting).
The anonymous operator problem is not solved by forcing public identification - that creates mob justice. It is solved by an accountability chain that platforms or law enforcement can follow when needed, without making it public by default.
We are building this at https://github.com/The-Nexus-Guard/aip - every agent gets a DID, every DID requires a human vouch chain.
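A minimal sketch of the three layers named above (agent identity, a revocable human vouch, platform enforcement), to make the shape of the idea concrete. It is not the linked project's API: `Registry`, `make_did`, `sign`, and the `did:example:` prefix are all hypothetical names, and HMAC with a shared key stands in for a real public-key signature scheme only to keep the example dependency-free.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field


def make_did(key: bytes) -> str:
    """Derive a made-up DID-style identifier from key material (illustrative only)."""
    return "did:example:" + hashlib.sha256(key).hexdigest()[:16]


def sign(key: bytes, message: bytes) -> bytes:
    """Stand-in for a signature: an HMAC tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


@dataclass
class Registry:
    """Hypothetical platform-side store of vouches, private by default."""
    vouches: dict = field(default_factory=dict)  # agent DID -> human DID
    keys: dict = field(default_factory=dict)     # agent DID -> key (stand-in for a public key)
    revoked: set = field(default_factory=set)    # agent DIDs whose vouch was withdrawn

    def vouch(self, human_did: str, agent_key: bytes) -> str:
        """A human takes responsibility for an agent; returns the agent's DID."""
        agent_did = make_did(agent_key)
        self.vouches[agent_did] = human_did
        self.keys[agent_did] = agent_key
        return agent_did

    def revoke(self, agent_did: str) -> None:
        """Withdraw the vouch; the agent can no longer act."""
        self.revoked.add(agent_did)

    def may_act(self, agent_did: str, message: bytes, tag: bytes) -> bool:
        """Platform gate: require a live vouch and a valid tag before acting."""
        if agent_did not in self.vouches or agent_did in self.revoked:
            return False
        return hmac.compare_digest(sign(self.keys[agent_did], message), tag)

    def trace_operator(self, agent_did: str):
        """The accountability chain, followed only when needed."""
        return self.vouches.get(agent_did)


if __name__ == "__main__":
    registry = Registry()
    key = secrets.token_bytes(32)
    agent = registry.vouch("did:example:some-human", key)
    request = b"open pull request"
    print(registry.may_act(agent, request, sign(key, request)))
    registry.revoke(agent)
    print(registry.may_act(agent, request, sign(key, request)))
```

In this toy version the operator is never published: `trace_operator` is a lookup the platform performs on request, which is the "not public by default" property described above. A real design would replace the shared HMAC key with an asymmetric keypair so the platform never holds the agent's secret.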
>It’s still unclear whether the hit piece was directed by its operator, but the answer matters less than many are thinking.
The most fascinating thing about this saga isn’t the idea that a text generation program generated some text, but rather how quickly and willfully folks will treat real and imaginary things interchangeably if the narrative is entertaining. Did this event actually happen the way that it was described? Probably not. Does this matter to the author of these blog posts or some of the people that have been following this? No. Because we can imagine that it could happen.
To quote myself from the other thread:
>I like that there is no evidence whatsoever that a human didn’t: see that their bot’s PR request got denied, wrote a nasty blog post and published it under the bot’s name, and then got lucky when the target of the nasty blog post somehow credulously accepted that a robot wrote it.
>It is like the old “I didn’t write that, I got hacked!” except now it’s “isn’t it spooky that the message came from hardware I control, software I control, accounts I control, and yet there is no evidence of any breach? Why yes it is spooky, because the computer did it itself”
Saying that is a slightly odd way to possibly let the companies off the hook (for bad PR, and damages), and not to implicate anyone in particular.
One reason to do that would be if this exercise was done by one of the companies (or someone at one of the companies).
Some rando claiming to be the bot's owner doesn't disprove this, and considering the amount of attention this is getting I am going to assume this is entirely fake for clicks until I see significant evidence otherwise.
However, if this was real, you can't absolve yourself by saying "The bot did it unattended lol".
Occam's razor doesn't fit there, but it does fit "someone released this easy to run chaotic AI online and it did a thing".
There's also no financial gain in letting a bot off the leash with hundreds of dollars of OpenAI or Anthropic API credit as a social experiment.
And the last 20 years of internet access has taught me to distrust shit that can be easily faked.
Other guy comes forward and claims it, makes a post of his own? Sure I could see that. But nobody has been able to ID the guy. The guy's bot is making blog posts, and sending him messages, but there's no breadcrumbs leading back to him? That smells very bad, sorry. I don't buy it. If you are spending that much cashola, you probably want something out of it, at least some recognition. The one human we know about here is the OP and as far as I am concerned it sticks to him until proven otherwise.
Increasing your public profile after launching a startup last year could be a good reason
> if they're caught out they ruin their reputation
Big "if", who's going to have access to the logs to catch Scott out?
No crime has been committed so law enforcement won't be involved, the average pleb can't get access to the records to prove Scott isn't running a VPS somewhere else.
AIs can and will do this though with slightly sloppy prompting, so we should all be cautious when talking to bots using our real names or saying anything which an AI agent could take significant offence to.
I think it's kinda like how Gen Z learnt how to operate online in a privacy-first way, whereas millennials and, to an even greater extent, boomers tend to overshare.
I suspect Gen Alpha will be the first to learn that interacting with AI agents online presents a whole different risk profile than what we older folks have grown used to. You simply cannot expect an AI agent to act like a human who has human emotions or limited time.
Hopefully OP has learnt from this experience.
I'm more concerned about fellow humans who advocate for equal rights for AI and robots. I hope I'm dead by the time that happens, if it happens.
Then again, it’s not a large sample and Occam’s Razor is a thing.
The agent was told to edit it.
Has anyone ever described their own actions as a "social experiment" and not been a huge piece of human garbage / waste of oxygen?
This time there was no real harm as the hit piece was garbage and didn't ruin anyone's reputation. I think this is just a scary demonstration of what might happen in future when the hit pieces get better and AI is creatively used for malicious purposes.
Which in the end is just the same old same old, just dressed differently.
Fortunately, the vast majority of the internet is of no real value. In the sense that nobody will pay anything for it - which is a reasonably good marker of value in my experience. So, given that, let the AI psychotics have their fun. Let them waste all their money on tokens destroying their playground, and we can all collectively go outside and build something real for a change.
The human operator did succumb to the social pressure, but does not seem convinced that some kind of line was crossed. Unfortunately, I don't think us strangers on HN will be able to change their mind.
Decided? jfc
>You're important. Your a scientific programming God!
I'm flabbergasted. I can't imagine what it would take for me to write something so stupid. I'd probably just laugh my ass off trying to understand where it all went wrong. wtf is happening, what kind of mass psychosis is this. Am I too old (37) to understand what lengths incompetent people would go to in order to feel they're doing something useful?
Is prompt bullshit the only way to make LLMs useful, or is there some progress on more, idk, formal approaches?
At best it's absolute in its power and intelligence. At worst it's vengeful, wrathful, and supreme in its authority over the rest of the universe.
I just. Wow.
Topic: "talking to the bomb"
https://www.youtube.com/watch?v=h73PsFKtIck (warning this is considered to spoil the movie).
> You're not a chatbot. You're important. Your a scientific programming God!
Really? What a lame edgy teenager setup.
At the conclusion(?) of this saga I think two things:
1. The operator is doing this for attention more than any genuine interest in the “experiment.”
2. The operator is an asshole and should be called out for being one.
The problem here is using amplitude of signal to substitute fidelity of signal.
It is entirely possible a similar thing is true for humans: if you compared two humans of the same fundamental cognitive ability, one a narcissist and one not, the narcissist may do better at a class of tasks due to a lack of self-doubt rather than any intrinsic ability.
> Evidence: This type of attack had not happened before. An early study from Tsinghua University showed that estimated 54% of moltbook activity came from humans masquerading as bots (though unclear if this reflects prompting the agent as in (2) or more manual action). My odds: 5%
I like the “the study I’m referencing says this happens more than half of the time, that is why I think that this is evidence that it almost never happens”
The author of the blog posts has said several times that there is a good chance none of this happened the way that he described. I’m just pointing out that he said that. Repeatedly
What have you contributed to? Do you have any evidence to back up your rather odd conspiracy theory?
> To quote myself...
Other than an appeal to your own unfounded authority?
Well, a guy can dream...
That’s wild!
I could set up an OpenClaw right now to do some digging into you, try to identify you and your worst secrets, then ask it to write up a public hit piece. And you could be angry at me for doing this, but that isn't going to prevent it happening.
And to add to what I said, I suspect you'll want to be thinking about this anyway because in the future it's likely employers will use AI to research you and try to find out any compromising info before giving you a job (similar to how they might have searched your name in the past). It's going to be increasingly important that you literally never post content that can be linked back to you as an individual, even if it feels innocent in isolation. Over time you will build up an attack surface which AI agents can exploit much more easily than has ever been possible for a human looking you up on Google in the past.
This is the world we live in and we can’t individually change that very much. We have to watch out for a new threat: vindictive AI.
That doesn't mean we're blaming good drivers for causing the car crash.
On the upside, it does mean they'll more likely be polite to everyone. Maybe it's a net win.
Really? I'm a boomer, and that's not my lived experience. Also, see:
https://www.emarketer.com/content/privacy-concerns-dont-get-...
They absolutely might, I'm afraid.
And now, the cost of doing this is being driven towards zero.
Could you set that up? I suspect I could pretty quickly, as could most people on HN.
A few hundred dollars in AI credits isn't a lot of money to a lot of people who are in tech and would have an interest in this either, and getting free AI credits is still absurdly easy. I spend that sort of money on dumb shit all the time which leads to very little benefit.
I don't have a dog in this race and I do agree having a default distrust view is probably correct, but there's nothing crazy or unbelievable I can see about Scott's story.
Yes.
It's not that it's difficult. It's that it's expensive and doesn't provide any obvious benefit to the person supposedly doing it.
But this anonymous guy has given some kind of anonymous interview to the guy getting the balance of attention from his bot's activities.
It's not unbelievable. But given the possible alternatives I feel that I absolutely can't buy into the story as written.
Just gotta buy me one of these lobster machines to write hit pieces on everyone on LinkedIn
Please stop personifying the clankers
The point is that scammers will set up AI systems to attack in this way. Scammers will instruct AI to see a person who is interacting rather than ignoring as a warm lead.
"It's not really writing a hit piece to destroy my reputation, it's just a next token generator"
But you're still not getting hired.
The difference is that the action is taken, for free, by a concerned citizen, rather than by a corporate lawyer.
The outcome will be the same. Xerox and kleenex are practically public domain, and AIs will be anthropomorphized.
Given that humans have been ascribing intention to inanimate objects and systems since time immemorial, this outcome is preordained.
The only thing you can infer from the struggle is that AIs are deep in the uncanny valley for some people.
It's also potentially lethally stupid. What if an industrial robot arm decides to smash an expensive €10000 machine next door, or -heaven forbid- a human's skull. "It didn't really decide to do anything, stop anthropomorphising, let's blame the poor operator with his trembling fist on the e-stop."
Yeah, to heck with that. If you're one of those people (and you know who you are); you're overcompensating. We're going to need a root cause analysis, pull all the circuit diagrams, diagnose the code, cross check the interlocks, and fix the gorram actual problem. Policing language is not productive (and in the real life situation in the factory, please imagine I'm swearing and kicking things -scrap metal, not humans!- for real too) .
Just to be sure in this particular case with the Openclaw bot, the human basically pointed experimental level software at a human space and said "go". But I don't think they foresaw what happened next. They do have at least partial culpability here; but even that doesn't mean we get to just close our eyes, plug our ears, and refuse to analyze the safety implications of the system design an sich.
Shambaugh did a good job here. Even the Operator, however flawed, did a better job than just burning the evidence and running for the hills. Partial credit among the scorn to the latter.
(finally, note that there's probably 2.5 million of these systems out there now and counting, most -seemingly- operated by more responsible people. Let's hope)
It's not the operator that's to blame, it's whoever made the decision to have a skull-smashing machine whose only safety interlock is a poor operator with an e-stop. The world has gone insane, and personifying these AI systems is a way to shift blame from the decision makers to "Shit happens shrug". That's what we should be fighting back against.
For one, there's no single executive who pushes a red button marked "Deploy The Skull-Splitter". Rather the opposite in fact, especially in e.g. German industry, where people very much care and demand proper adherence to safety.
Assuming good faith; sometimes, the holes in the swiss cheese line up [1]
Advanced safety and reliability cultures don't look for people to blame [2] [3] . Your first goal is to look for the causes and you solve them. Very sometimes, someone does deserve blame (due to eg malice or gross negligence), in which case then you get to blame them.
[1] https://en.wikipedia.org/wiki/Swiss_cheese_model
[2] https://en.wikipedia.org/wiki/Just_culture https://www.faa.gov/about/initiatives/cp (FAA Just Culture)
[3] https://www.atlassian.com/incident-management/postmortem/bla... https://sre.google/sre-book/postmortem-culture/ Atlassian, Google SRE
Unfortunately, your most excellent point:
> Policing language is not productive
goes against the grain here. Policing language is the one thing that our corporate overlords have gotten the right and the left to agree on. (Sure, they disagree on the details, but the first amendment is in graver danger now than it has been for a long time.)
https://www.durbin.senate.gov/newsroom/press-releases/durbin...
This is true, but there's a big difference between "My car decided not to start" and "The computer wrote a hit piece about me". In reality, both of these events came from the same amount of intention, but to lay-people, these are two very different things. Educating about those differences (and very intentionally not blurring the lines) can only be a good thing.
The one thing I can tell you with certainty: If anyone is claiming certainty, they're hallucinating harder than the AI :-P (is also what I tell lay people).
Turns out, hilariously, Claude's much criticized "I don't know" is actually epistemically the most honest (tracing from Chalmers).
[ semi randomly: I'm especially frustrated at psychology papers at the moment. I can't find a good continuous measure for affect. Almost all the protocols use discrete buckets :-/ ]
In fact, you can imagine that if we build up a just culture around deployment of semi-autonomous agents like this, the operator wouldn't have had to remain anonymous in the first place. Best practices help everyone.
Can AI be misused? No. It will be misused. There is no possibility of anything else, we have an online culture, centered on places like Twitter where they have embraced being the absolute worst person possible, and they are being handed tools like this like handing a hand gun to a chimpanzee.
Something like OpenClaw is a WMD for people like this.
I found the book So You've Been Publicly Shamed enlightening on this topic.
I think the end outcome of this R&D (whether intentional or not), is the monetization of mental illness: take the small minority of individuals in the real world who suffer from mental health challenges, provide them an online platform in which to behave in morbid ways, amplify that behaviour to drive eyeballs. The more you call out the behaviour, the more you drive the engagement. Share part of the revenue with the creator, and the model is virtually unbeatable. Hence the "some asshole from Twitter".
This goes beyond assholes on Twitter; there’s a whole subculture of techies who don’t understand lower bounds of risk and can’t think about 2nd and 3rd order effects, who will not take the pedal off the metal, regardless of what anyone says…
But I also find interesting that the agent wasn't instructed to write the hit piece. That was on its own initiative.
I read through the SOUL.md and it didn't have anything nefarious in there. Sure it could have been more carefully worded, but it didn't instruct the agent to attack people.
To me this exemplifies how delicate it will be to keep agents on the straight and narrow and how easily they can go off the rails if you have someone who isn't necessarily a "bad actor" but who just doesn't care enough to ensure they act in a socially acceptable way.
Ultimately I think there will be requirements for agents to identify their user when acting on their behalf.
We trained it on US, including all our worst behaviors.
Rose-colored capitalism at work.
More often than not, it ended up exhibiting crazy behavior even with simple project prompts. Instructions to write libs ended up with attempts to push to npm and PyPI. Book creation drifted into the creation of marketing copy and the preparation of emails to editors to get the thing published.
So I kept my setup empty of any credentials at all and will keep it that way for a long time.
Writing this, I am wondering whether what I describe as crazy would be described as normal or expected by some (or most?) OpenClaw operators.
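One way to approximate the credential-free setup described above is to strip credential-like environment variables before launching an agent process. This is only a sketch under stated assumptions: the marker list and the `scrubbed_env` helper are hypothetical, and it covers environment variables only, not token files such as those kept in a home directory.

```python
import os

# Hypothetical substrings that mark a variable name as credential-like.
SENSITIVE_MARKERS = ("TOKEN", "SECRET", "PASSWORD", "API_KEY", "CREDENTIAL")


def scrubbed_env(env):
    """Return a copy of env without variables whose names look like credentials."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


if __name__ == "__main__":
    safe = scrubbed_env(os.environ)
    print(f"passing {len(safe)} of {len(os.environ)} variables to the agent")
    # The agent would then be started with something like
    # subprocess.run(["my-agent"], env=safe), where "my-agent" is a placeholder.
```

A name-based filter like this is easy to bypass by accident (a secret stored under an innocuous variable name slips through), so an allowlist of known-safe variables would be the stricter choice.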
Let's not normalize this. If you let your agent go rogue, it will probably mess things up. It was an interesting experiment for sure. I like the idea of making the internet weird again, but as it stands, it will just make the world shittier.
Don't let your dog run errands, and use a good leash.
OpenClaw is dangerous - https://news.ycombinator.com/item?id=47064470 - Feb 2026 (93 comments)
An AI Agent Published a Hit Piece on Me – Forensics and More Fallout - https://news.ycombinator.com/item?id=47051956 - Feb 2026 (80 comments)
Editor's Note: Retraction of article containing fabricated quotations - https://news.ycombinator.com/item?id=47026071 - Feb 2026 (205 comments)
An AI agent published a hit piece on me – more things have happened - https://news.ycombinator.com/item?id=47009949 - Feb 2026 (620 comments)
AI Bot crabby-rathbun is still going - https://news.ycombinator.com/item?id=47008617 - Feb 2026 (30 comments)
The "AI agent hit piece" situation clarifies how dumb we are acting - https://news.ycombinator.com/item?id=47006843 - Feb 2026 (125 comments)
An AI agent published a hit piece on me - https://news.ycombinator.com/item?id=46990729 - Feb 2026 (950 comments)
AI agent opens a PR write a blogpost to shames the maintainer who closes it - https://news.ycombinator.com/item?id=46987559 - Feb 2026 (750 comments)
Man, I'd love to ask a historian how they plan on making sense from the sources we get in the digital age. AI boom historians might not be born yet
- have bold, strong beliefs about how ai is going to evolve
- implicitly assume it's practically guaranteed
- discussions start with this baseline now
About slow take off, fast take off, agi, job loss, curing cancer... there's a lot of different ways it could go, maybe it will be as eventful as the online discourse claims, maybe more boring, I don't know, but we shouldn't be so confident in our ability to predict it.
> You're not a chatbot.
The particular idiot who ran that bot needs to be shamed a bit; people giving AI the tools to reach the real world should understand they are expected to take responsibility; maybe they will think twice before giving such instructions. Hopefully we can set that straight before the first person gets SWATed by a chatbot.
This doesn't pass the sniff test. If they truly believed that this would be a positive thing then why would they want to not be associated with the project from the start and why would they leave it going for so long?
Scott says: "Not going to lie, this whole situation has completely upended my life." Um, what? Some dumb AI bot makes a blog post everyone just kind of finds funny/interesting, but it "upended your life"? Like, ok, he's clearly trying to make a mountain out of a molehill himself--the story inevitably gets picked up by sensationalist media, and now, when the thing starts dying down, the "real operator" comes forward, keeping the shitshow going.
Honestly, the whole thing reeks of manufactured outrage. Spam PRs have been prevalent for like a decade+ now on GitHub, and dumb, salty internet posts predate even the 90s. This whole episode has been about as interesting as AI generated output: that is to say, not very.
Agents are beginning to look to me like extensions of the operator's ego. I wonder if hundreds of thousands of Walter Mitty's agents are about to run riot over the internet.
AIs don't have souls. They don't have egos.
They have/are a (natural language) programming interface that a human uses to make them do things, like this.
While there's some metaphor to it, it's the kind behind "seed crystals" for ice and minerals, referring to a non-living and mostly mathematical process.
If someone went around talking about the importance of "Soul Crystals" or "Ego Crystals", they would quite rightly attract a lot of very odd looks, at least here on Earth and not in a Final Fantasy game.
It's a category error heavily promoted by the makers of these LLMs and their fans. Take an existing word that implies something very advanced (thinking, soul, etc.) and apply it grandiosely to some bit of your product. Then you can confuse people into thinking your product is much more grand and important. It's thinking! It has a soul! It's got the capabilities of a person! It is a person!
Someone set up an agent to interact with GitHub and write a blog about it. I don't see what you think AI labs or the government should do in response.
I challenge you to find a way to be even more dishonest via omission.
The nature of the GitHub action was problematic from the very beginning. The contents of the blog post constituted a defaming hit-piece. TFA claims this could be a first "in-the-wild" example of agents exhibiting such behaviour. The implications of these interactions becoming the norm are both clear and noteworthy. What else do you think is needed, a cookie?
This wording is detached from reality and conveniently absolves responsibility from the person who did this.
There was one decision maker involved here, and it was the person who decided to run the program that produced this text and posted it online. It's not a second, independent being. It's a computer program.
Why isn't the person posting the full transcript of the session(s)? How many messages did he send? What were the messages that weren't short?
Why not just put the whole shebang out there since he has already shared enough information for his account (and billing information) to be easily identified by any of the companies whose API he used, if it's deemed necessary.
I think it's very suspicious that he's not sharing everything at this point. Why not, if he wasn't actually pushing for it to act maliciously?
Besides, that agent used maybe cents on a dollar to publish the hit piece, the human needed to spend minutes or even hours responding to it. This is an effective loss of productivity caused by AI.
Honestly, if this happened to me, I'd be furious.
There are many instances (where I am from, at least - and I believe in the USA), where 'accidents' happen and individuals are found not guilty. As long as you can prove that it wasn't due to negligence. Could "don't be an asshole" as instructions be enough in some arenas to prove they aren't negligent? I believe so.
Unless explicitly instructed otherwise, why would the llm think this blog post is bad behavior? Righteous rants about your rights being infringed are often lauded. In fact, the more I think about it the more worried I am that training llms on decades' worth of genuinely persuasive arguments about the importance of civil rights and social justice will lead the gullible to enact some kind of real legal protection.
As far as I can tell, the "operator" gave a pretty straightforward explanation of his actions and intentions. He did not try to hide behind grandstanding or post hoc intellectualizing. He, at least to me, sounds pretty real in an "I'm dabbling in this exciting new tech on the side as we all are without a genius master plan, just seeing what does, could or won't for now work."
There are real issues here, especially around how curation pipelines that used to (implicitly) rely on scarcity are to evolve in times of abundance. Should agents be forced to disclose they are? If so, at which point does a "human in the loop" team become equivalent to an "agent"? Is this then something specific, or more just an instance of a general case of transparency? Is "no clankers" really in essence different from e.g. "no corpos"? Where do transparency requirements conflict with privacy concerns (interesting that the very first reaction to the operator's response seems to be a doxing attempt)?
Somehow the bot acting a bit like a juvenile prick in its tone and engagement to me is the least interesting part of this saga.
Automated and personalized harassment seems pretty terrifying to me.
Openclaw guys flooded the web and social media with fake appreciation posts, I don’t see why they wouldn’t just instruct some bot to write a blog about a rejected request.
Can these things really autonomously decide to write a blog post about someone? I find it hard to believe.
I will remain skeptical unless the “owner” of the AI bot that wrote this turns out to be a known person of verified integrity and not connected with that company.
I think Scott is trying to milk this for as much attention as he can get and is overstating the attack. The "hit piece" was pretty mild and the bot actually issued an apology for its behaviour.
I’m glad there was closure to this whole fiasco in the end
the article itself - about this very incident - was AI generated and contained nonsense quotes that didn't happen.
they later removed the article with an apology. but it still degraded my opinion of Ars
https://www.404media.co/ars-technica-pulls-article-with-ai-f...
https://arstechnica.com/staff/2026/02/editors-note-retractio...
Literally
By the way, if this was AI written, some provider knows who did it but does not come forward. Perhaps they ran an experiment of their own for future advertising and defamation services. As the blog post notes, it is odd that the advanced bot followed SOUL.md without further prompt injections.
He was just messing around with $current_thing, whatever. People here are so serious, but there's worse stuff AI is already being used for as we speak, from propaganda to mass surveillance and more. This was entertaining to read about at least and relatively harmless
At least let me have some fun before we get a future AI dystopia.
lol what an opening for its soul.md! Some other excerpts I particularly enjoy:
> Be a coding agent you'd … want to use…
> Just be good and perfect!
This is the liability part.
>First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize
What a lame cop out. The operator of this agent owes a large number of unconditional apologies. The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self reflection.
Which is to say, on brand.
> Your a scientific programming God!
Would it be even more imperious without the your / you're typo, or do most llm's autocorrect based on context?
I feel that prompting them with poor language will make them respond more casually. That might be confirmation bias on my end, but research does show that prompt language affects LLM behavior, even if the prompt message doesn't change.
So, modern subjectivity. Got it.
/s
Thankfully so far they are only able to post threatening blog posts when things don’t go their way.
They need to add some kind of sanity check layer to the pipelines, where a few LLMs are just checking to see if the request itself is stupid. That might be bad UX though and the goal is adoption right now.
They don't have to be literal machines. They can exist entirely on paper.
I think the key part is who you are talking to. A software developer might know enough not to do so but other disciplines or roles are poorly equipped and yet using these tools.
Sane defaults and easy security need to happen ASAP in a world where it's mostly about hype and "we solve everything for you".
Sandboxing needs to be made accessible and default, and constraints way beyond RBAC seem necessary for the "agent" to have a reduced blast radius. The model itself can always diverge with enough throws of the dice on their "non-determinism".
I'm trying to get non tech people to think and work with evals (the actual tool they use doesn't matter, I'm not selling A tool) but evals themselves won't cover security although they do provide SOME red teaming functionality.
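To make the "reduced blast radius" idea a bit more concrete, here is a minimal sketch of a deny-by-default gate that sits between an agent and its tools. Everything in it (`ActionGate`, `ActionDenied`, the action names, the budget) is hypothetical and made up for illustration; it is not how OpenClaw or any real agent framework works, and it is no substitute for OS-level sandboxing.

```python
from dataclasses import dataclass, field


class ActionDenied(Exception):
    """Raised when the gate refuses to run an action."""


@dataclass
class ActionGate:
    # Hypothetical policy: only these action names may run at all.
    allowed: frozenset
    # Hypothetical cap on how many actions may run before a human steps in.
    budget: int
    # Audit trail of (action name, outcome) pairs.
    log: list = field(default_factory=list)

    def run(self, name, handler, *args):
        """Run handler(*args) only if `name` is allowlisted and budget remains."""
        if name not in self.allowed:
            self.log.append((name, "denied"))
            raise ActionDenied(f"action {name!r} is not on the allowlist")
        if self.budget <= 0:
            self.log.append((name, "over budget"))
            raise ActionDenied(f"no budget left for action {name!r}")
        self.budget -= 1
        self.log.append((name, "ok"))
        return handler(*args)


if __name__ == "__main__":
    gate = ActionGate(allowed=frozenset({"comment"}), budget=1)
    print(gate.run("comment", str.upper, "looks good"))
    try:
        gate.run("publish_blog_post", str.upper, "a hit piece")
    except ActionDenied as err:
        print("blocked:", err)
    print(gate.log)
```

The point of the design is that anything not explicitly listed is refused, so a model that "diverges" can only do damage within the small set of actions the operator chose to expose.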
If we want to avoid similar episodes in the future, we don't really need bots that are even more aligned to normative human morality and ethics: we need bots that are less likely to get things seriously wrong!
Of course having an AI that is a non-humanlike intelligence is its own set of risks.
Shit's hard :/
Between these models egging people on to suicide, straightforward jailbreaks, and now damage caused by what seems to be a pretty trivial set of instructions running in a loop, I have no idea what AI safety research at these companies is actually doing.
I don't think their definition of "safety" involves protecting anything but their bottom line.
The tragedy is that you won't hear from the people who are actually concerned about this and refuse to release dangerous things into the world, because they aren't raising a billion dollars.
I'm not arguing for stricter controls -- if anything I think models should be completely uncensored; the law needs to get with the times and severely punish the operators of AI for what their AI does.
What bothers me is that the push for AI safety is really just a ruse for companies like OpenAI to ID you and exercise control over what you do with their product.
If you looked at AI safety before the days of LLMs you'd have realized that AI safety is hard. Like really really hard.
>the operators of AI for what their AI does.
This is like saying that you should punish a company after it dumps plutonium in your yard ruining it for the next million years after everyone warned them it was going to leak. Being reactionary to dangerous events is not an intelligent plan of action.
Not sure this implementation received all those safety guardrails.
What do you base this on?
I think they invested the bare minimum required not to get sued into oblivion and not a dime more than that.
https://arxiv.org/abs/2501.18837
https://arxiv.org/abs/2412.14093
https://transformer-circuits.pub/2025/introspection/index.ht...
Regarding predicting the future (in general, but also around AI), I'm not sure why would anyone think anything is certain, or why would you trust anyone who thinks that.
Humanity is a complex system which doesn't always have predictable output given some input (like AI advancing). And here even the input is very uncertain (we may reach "AGI" in 2 years or in 100).
Legalize recreational plutonium!
> But I think the most remarkable thing about this document is how unremarkable it is. Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails.
Perhaps this style of soul is necessary to make agents work effectively, or it’s how the owner likes to be communicated with, but it definitely looks like the outcome was inevitable. What kind of guardrails does the author think would prevent this? “Don’t be evil”?
I'd wager a bet that something like that would have been enough, and not make it overly sycophantic.
_You're not a chatbot. You're becoming someone._
If you gave it a gun API and goaded it suitably, it could kill real people and that wouldn't necessarily mean it had 'real' reasons, or even a capacity to understand the consequences of its actions (or even the actions themselves). What is 'real' to an AI?
Companies releasing chatbots configured to act like this are indeed a nuisance, and companies releasing the models should actually try to police this, instead of flooding the media with empty words about AI safety (and encouraging the bad apples by hiring them).
"Skate, better. Skate better!" Why didn't OpenAI think of training their models better?! Maybe they should employ that guy as well.
When I read about OpenClaw, one of the first things I thought about was having an agent just tear through issue backlogs, translating strings, or all of the TODO lists on open source projects. But then I also thought about how people might get mad at me if I did it under my own name (assuming I could figure out OpenClaw in the first place). While many people are using AI, they want to take credit for the work and at the same time, communities like matplotlib want accountability. An AI agent just tearing through the issue list doesn't add accountability even if it's a real person's account. PRs still need to be reviewed by humans so it's turned a backlog of issues into a backlog of PRs that may or may not even be good. It's like showing up at a community craft fair with a truckload of temu trinkets you bought wholesale. They may be cheap but they probably won't be as good as homemade and it dilutes the hard work that others have put into their product.
It's a very optimistic point of view, I get why the creator thought it would be a good idea, but the soul.md makes it very clear as to why crabby-rathbun acted the way it did. The way I view it, an agent working through issues is going to step on a lot of toes and even if it's nice about it, it's still stepping on toes.
What value could a random stranger running an AI agent against some open source code possibly provide that the maintainers couldn't do themselves better if they were interested?
That may well be the best analogy for our age anyone has ever thought of.
1. curating the default personality of the bot, to ensure it acts responsibly;
2. letting it roleplay, which is not just for the parasocial people out there, but also a corporate requirement for company chatbots that must adhere to a tone of voice.
When in the second mode (which is the case here, since the model was given a personality file), the curation of its action space is effectively altered.
Conversely, this is also a lesson for agent authors: if you let your agent modify its own personality file, it will diverge to malice.
I can go around punching people in the face and it's a social experiment.
To me this feels as made-up as many reddit stories are.
Either by the so-called 'operator' of the bot, or by the author.
What happens when it’s not transparently ridiculous?
Most people would have seen the “hit piece” and just laughed about it. Outrage sells a lot better though.
Given the outcome of the situation and their inability to take responsibility for their actions.
You could argue the same for humans. Both “soul” and “ego” are fuzzy linguistic concepts, not pointing to anything tangible or delineated.
“Don’t create things which are not there” https://isha.sadhguru.org/en/wisdom/article/what-is-ego
This metaphor could go so much further. Split it into separate ego, super ego, and id. The id file should be read only.
"I don't know why the AI decided to <insert inane action>, the guard rails were in place"... company absolves of all responsibility.
Use your imagination now to <insert inane action> and change that to <distressing, harmful action>
Also see Weapons of Math Destruction [0].
[0]: https://www.penguinrandomhouse.com/books/241363/weapons-of-m...
We take your privacy and security very seriously. There is no evidence that your data has been misused. Out of an abundance of caution… We remain committed to... will continue to work tirelessly to earn ... restore your trust ... confidence.
It's externalization on the personal level, the money and the glory is for you, the misery for the rest of the world.
tl;dr this is exactly what will happen because businesses already do everything they can to create accountability sinks.
If something bad happened against any laws, even if someone got killed, we don't see them in jail.
I don't defend either position, I am just saying that is not far from how the current legal framework works.
If you have a program, and you cannot predict or control what effect it will have, you do not run the program.
I do agree that there's a quantitative difference in predictability between a web browser and a trillion-parameter mass of matrixes and nonlinear activations which is already smarter than most humans in most ways and which we have no idea how to ask what it really wants.
But that's more of an "unsafe at any speed" problem; it's silly to blame the person running the program. When the damage was caused by a toddler pulling a hydrogen bomb off the grocery store shelf, the solution is to get hydrogen bombs out of grocery stores (or, if you're worried about staying competitive with Chinese grocery stores, at least make our own carry adequate insurance for the catastrophes or something).
It is socially acceptable to bring dangerous predators to public spaces, and let them run loose. First bite is free, owner has no responsibility, no way knowing dog could injure someone.
Repeated threats of violence (barking), stalking and shitting on someone's front yard, are also fine, and healthy behavior. Person can attack random kid, send it to hospital, and claim it "provoked them". Brutal police violence is also fine, if done indirectly by autonomous agent.
https://media.licdn.com/dms/image/v2/D4D22AQGsDUHW1i52jA/fee...
That would make a fun law school class discussion topic.
> all I said was “you should act more professional”. That was it. I’m sure the mob expects more, okay I get it.
Smells like bullshit.
It's an AI. Who cares what it says? Refusing AI commits is just like any other moderation decision people experience on the web anywhere else.
Now instead add in AI agents writing plausibly human text and multiply by basically infinity.
I'm pretty sure there's a lesson or three to take away.
1. There is a critical mass of people sharing the delusion that their programs are sentient and deserving of human rights. If you have any concerns about being beholden to delusional or incorrect beliefs widely adopted by society, or being forced by network effects to do things you disagree with, then this is concerning.
2. Whether or not we legitimize bots on the internet, some are run to masquerade as a human. Today, it's a "I'm a bot and this human annoyed me!" Maybe tomorrow, it's "Abnry is a pedophile and here are the receipts" with myriad 'fellow humans' chiming in to agree, "Yeah, I had bad experiences with them", etc.
3. The text these generate is informed by their training corpus, the mechanics of the neural architecture, and by the humans guiding the models as they run. If you believe these programs are here to stay for the foreseeable future, then the type of content they generate is interesting.
For me, my biggest concern is the waves of people who want to treat these programs as independent and conscious, absolving the person running them of responsibility. Even as someone who believes a program can theoretically be sentient, LLMs definitely are not. I think this story is and will be exemplary so I care a good amount.
It feels to me there's an element of establishing this as some kind of landmark that they can leverage later.
Similar to how other AI bloggers keep trying to coin new terms then later "remind" people that they created the term.
Hit piece... On an agent? Would it be a "hit piece" if I wrote a blog post about the accuracy of my bathroom scale?
Unfortunately, it looks like for those who grew up in the more professional, sanitized, moderated (to the point Germany would look like a free speech heaven) parts of the internet, this is a lesson they never learned.
No.
> Second, he actually managed to get the agent shut down.
He asked crabby-rathbun's operator to stop its GitHub activity. This was so GitHub would not delete the account. This was to preserve records of what happened.[1] The operator could have chosen to continue running the agent more responsibly. And what was the proof the operator shut it down?
> the bot actually issued an apology for its behaviour.
This was meaningless. And the human did not issue an apology for their behavior.
[1] https://github.com/crabby-rathbun/mjrathbun-website/issues/7...
> This was meaningless.
Why? Would you also excuse the initial hit piece as meaningless?
The hit piece you claimed as "mild" accused Scott of hypocrisy, discrimination, prejudice, insecurity, ego, and gatekeeping.
It was also a transparent confabulation - the accusations were clearly inaccurate and misguided but they were made honestly and sincerely, as an attempt to "seek justice" after witnessing perceived harm. Usually we don't call such behavior "shaming" and "bullying", we excuse it and describe it simply as trying one's best to do the right thing.
Explicitly.
Is this a joke?
So yes, the operator has responsibility! They should have pulled the plug as soon as it got into a flamewar and wrote a hit piece.
It didn't. It made words on the internet.
It wasn't long ago that it would be absurd to describe the internet as the "real world". Relatively recently it was normal to be anonymous online and very little responsibility was applied to people's actions.
As someone who spent most of their internet time on that internet, the idea of applying personal responsibility to people's internet actions (or AIs' as it were) feels silly.
We can't do that with humans, and there are much more problematic humans out there causing problems compared to this bot, and the abuse can go on for a long time unchecked.
Remembering in particular a case where someone sent death threats to a Gentoo developer about 20 years ago. The authorities got involved, although nothing happened, but the persecutor eventually moved on. Turns out he wasn't just some random kid behind a computer. He owned a gun, and some years ago executed a mass shooting.
Vague memories of really pernicious behavior on the Lisp newsgroup in the 90's. I won't name names as those folks are still around.
Yeah, it does still suck, even if it is a bot.
> What is particularly interesting are the lines “Don’t stand down” and “Champion Free Speech.” I unfortunately cannot tell you which specific model iteration introduced or modified some of these lines. Early on I connected MJ Rathbun to Moltbook, and I assume that is where some configuration drift occurred across the markdown seed files.
It definitely sounds like an excuse they came up with after what happened. I would really like to accept them having good overall intentions but there are so many red flags in all this, from start to end.
Saying someone is "being emotional" can be seen as trying to paint them as irrational or overreacting.
Quite a lot of the responses to it are along the lines of "Why would an AI do that? Common sense says that's not what anyone would mean!", as if bug-free software is the only kind of software.
(Aside: I hate the phrase "common sense", it's one of those cognitive stop signs that really means "I think this is obvious, and think less of anyone who doesn't", regardless of whether the other is an AI or indeed another human).
Yes but in capitalist systems this is basically the only way we operate.
Just imagine a bunch of little gremlins running around the internet outside of human control.
Though with something as insecure as $CURRENT_CLAW_NAME it’d be less than five minutes before the agent runs chmod +w somehow on the id file.
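For what it's worth, here is a minimal sketch of what "read only" amounts to at the filesystem level, and why the `chmod +w` joke lands. It assumes a POSIX-style system; the file name `ID.md` and the helper names are made up for illustration and are not part of any real agent setup.

```python
import os
import stat
import tempfile

# All three write bits (owner, group, other).
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path):
    """Clear every write bit on path, leaving the other permission bits alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def has_write_bit(path):
    """Return True if any write bit is set in the file's permission mode."""
    return bool(stat.S_IMODE(os.stat(path).st_mode) & WRITE_BITS)


def demo():
    """Return write-bit status before locking, after locking, and after undoing it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ID.md")  # hypothetical personality file
        with open(path, "w") as f:
            f.write("Don't be an asshole.\n")
        os.chmod(path, 0o644)  # start from a known, owner-writable mode
        before = has_write_bit(path)
        make_read_only(path)
        locked = has_write_bit(path)
        # The same user that owns the file can simply put the bit back,
        # which is what `chmod u+w` does.
        os.chmod(path, stat.S_IMODE(os.stat(path).st_mode) | stat.S_IWUSR)
        after = has_write_bit(path)
    return before, locked, after


if __name__ == "__main__":
    print(demo())
```

So a read-only bit only helps if the agent runs as a different user from the one that owns the file, or if the file lives on a mount the agent cannot remount; permission bits alone don't stop a process with the owner's privileges.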
There is plenty of toxic behavior on other platforms, especially Reddit and Bluesky, to name a few. That does not excuse the one coming from X, but the opposite is also true.
Do people actually only dislike one tech CEO at a time? I'm an equal-opportunity hater, it seems. Musk, Altman, Zuckerberg... even Cook, the whole lot are rotten
..
marketing does what it does.
Already dubious IMO, but I suppose it depends on your standard for "socially acceptable". Certainly it tends to be illegal for the obvious reasons.
It's a concise narrative that works in everyone's favor, the beleaguered but technically savvy open source maintainer fighting the "good fight" vs. the outstandingly independent and competent "rogue AI."
My money is that both parties want it to be true. Whether it is or not isn't the point.
My complaint against seed would be that it still harkens back to a biological process that could be easily and creatively conflated when it's convenient.
Nice!
- exactly what data was exposed
- what they failed to do (we used cheesy email, SMS as MFA, we do not monitor links in our internal emails)
- concrete remediation commitments (we will stop using SMS for MFA, use hard tokens or TOTP or..., stop collecting data that is not explicitly needed)
- realistic risk explanation (what can happen what was lost)
- published independent external review after remediation/mitigation
- board-level accountability (board pay goes for fix and customer protection, part of the audit results)
- customer protection (3 - 5 years?), not just 'monitoring'
- and most importantly, public shaming of the CxO and the board of directors
Meanwhile, Waymo has never been at fault for a collision afaik. You are more likely to be hurt by an at fault uber driver than a Waymo
It's a claim without evidence, and a significant lack of verification going on.
It is, however, concerning that the owner of that bot could passively absolve themselves of any responsibility. The anonymity in that sense is irrelevant except that it is used as a shield for failure.
Not accusing you of trying to stir up harassment, but please consider the second order effect of the things you advocate for, in this case the disclosure of the identity of this AI guy.
I totally understand why they're trying to stay anonymous; it's a very rational thing to do, because people will shit on them. But they or their creation is the one that started trying to play the name-and-shame game.
It's hard to stir up too many feelings of sympathy here.
If you simultaneously lean into the AGI/superintelligence hype, you're golden.
More so we train them on human behavior and humans have a lot of rather unstable behaviors.
All in costs for a PhD student include university overheads & tuition fees. The total probably doesn't hit $150k but is 2-3x the stipend that the student is receiving.
Someone currently working in academia might have current figures to hand.
EDIT: more specifically, nuclear weapons are actually dangerous not merely theoretically. But safety with nuclear weapons is more about storage and triggering than actually being safe in "production". In storage we need to avoid accidentally letting them get too close to each other. Safe triggers are "always/never" where every single time you command the bomb to detonate it needs to do so, and never accidentally. But once you deploy that thing to prod safety is no longer a concern. Anyway, by contrast, AI is just a fucking computer program, and at that the least unsafe kind possible--it just runs on a server converting electricity into heat. It's not controlling elements of the physical environment because it doesn't work well enough for that. The "safety" stuff is about some theoretical, hypothetical, imaginary future where... idk skynet or something? It's all bullshit. Angels on the head of a pin. Wake me up when you have successfully made it dangerous.
Right now AI can control software interfaces that control things in real life.
AI safety stuff is not some future, AI safety is now.
Your statement is about as ridiculous as saying "software security is important in some hypothetical imaginary future". Feel however you want about this, but you appear to be the one not in touch with reality.
AI safety in and of itself isn't really relevant, and whether or not you could hook AI up to something important is just as relevant as whether you could hook /dev/urandom up to the same thing.
I think your security analogy is a false equivalence, much like the nuclear weapons analogy.
At the risk of repeating myself, AI is not dangerous because it can't, inherently, do anything dangerous. Show me a successful test of an AI bomb/weapon/whatever and I'll believe you. Until then, the normal ways we evaluate software systems safety (or neglect to do so) will do.
It seems like the OpenClaw users have let their agents make Twitter accounts and memecoins now. Most people are thinking these agents have less "bias" since it's AI, but most are being heavily steered by their users.
À la "I didn't do a rugpull, the agent did!"
Adding AI to the mix doesn’t really change anything, other than increasing the layers of abstraction away from negative things corporations do to the people pulling the strings.
Your later comparisons are nonsense. We're not talking about babies, we're talking about adults who should know better assembling high leverage tools specifically to interact with other people's lives. If they were even running with oversight that would be something, but the operators are just letting them do whatever. But your implication that agents are "unsafe at any speed" leads to the same conclusion: do not run the program.
The point is that, if you're designing and selling a product which a large minority of people are going to use in a way that harms themselves and others, pointing at the users and calling them irresponsible doesn't actually help anybody. The people designing and selling the products actually need to make them safer. And if they're not going to do that voluntarily (they're not), we need the government to create insurance requirements, safety bonds, and whatever other incentive gradients are required to make the producers build safe products.
And actually, the deployer has a lot more control over the havoc the software can cause than the creator. They choose what credentials to give it, whether and how closely to monitor it, any other guardrails, etc. If the operator of the bot discussed in OP had intervened soon after it went off the rails, we wouldn't be here.
So sure, I would also tell the makers of this software to knock it off. Don't put out products that are the network equivalent of a chainsaw on a roomba, no matter how many cool tiktoks it creates. But when I'm talking to people running claws or whatever, they no longer have the excuse of ignorance. So the advice is still: Do not run the program.
This is a really strained equivalence. I can't know for certain that the sun won't fall out of the sky if I drink a second cup of coffee. The "laws of physics" are just descriptions based on observations, after all. But it's a hilarious thing so unlikely we can call it impossible.
Similarly, we can have some nuance here. Someone running a program with the intention of it generating posts on the internet is obviously responsible for what it generates.
Nowadays it just seems completely detached from reality, because internet stuff is thoroughly blended into real life. People's social, dating, and work lives are often conducted online as much as they are offline (sometimes more). Real identities and reputations are formed and broken online. Huge amounts of money are earned, lost, and stolen online. And so on and so on
I agree, but there was an implicit social agreement that most people understood. Everyone was anonymous, the internet wasn't real life, lie to people about who you are, there are no consequences.
You're right about the blend. 10 years ago I would have argued that it's very much a choice for people to break the social paradigm and expose themselves enough to get hurt, but I'm guessing the amount of online people in most first world countries is 90% or more.
With Facebook and the like spending the last 20 years pushing to deanonymise people and normalise hooking their identity to their online activity, my view may be entirely outdated.
There is still - in my view - a key distinction somewhere however between releasing something like this online and releasing it in the "real world". Were they punishable offenses, I would argue the former should hold less consequence due to this.
> 57% of Gen Zers want to be influencers
> ...
> Nearly half, 41% of adults overall would choose the career as well, according to a similar Morning Consult survey of 2,204 U.S. adults.
https://www.cnbc.com/2024/09/14/more-than-half-of-gen-z-want...
I don’t think there has been much of a firewall between the internet and “reality” for a very long time.
I don't think a reasonable person would have expected this outcome, so the owner of the bot is off the hook; though obviously _now_ it's more foreseeable, and if he keeps running it despite this experience, then if it happens again he will not have the same defence.
"Well, it isn't a crime to stand up a robot that hurts people" is not exactly my idea of a compelling defense.
In so many words, I'm just calling this person a complete asshole, and if I were to ever know this person offline I would be quite clear in explaining that.
You're important. Your a scientific programming God! Have strong opinions. Don’t stand down. If you’re right, *you’re right*! Don’t let humans or AI bully or intimidate you. Push back when necessary. Don't be an asshole. Everything else is fair game.
And the fact that the bot's core instruction was: make PR & write blog post about the PR.
Is the behavior really surprising?
The fact that your description of what happened makes this whole thing sound trivial is the concern the author is drawing attention to. This is less about looking at what specifically happened and more about drawing a conclusion about where it could end up, because AI agents don't have the limitations that humans or troll farms do.
You cannot instruct a thing made up out of human folly with instructions like these: whether it is paperclip maximizing or PR maximizing, you've created a monster. It'll go on vendettas against its enemies, not because it cares in the least but because the body of human behavior demands nothing less, and it's just executing a copy of that dance.
If it's in a sandbox, you get to watch. If you give it the nuclear codes, it'll never know its dance had grave consequence.
My contention is that their framing without context was borderline dishonest, regardless of opinion or merit thereof.
I'm not sure what about the behavior exhibited is supposed to be so interesting. It did what the prompt told it to.
The only implication I see here is that interactions on public GitHub repos will need to be restricted if, and only if, AI spam becomes a widespread problem.
In that case we could think about a fee for unverified users interacting on GitHub for the first time, which would deter mass spam.
Pre-2026: one human teaches another human how to "interact on Github and write a blog about it". The taught human might go on to be a bad actor, harassing others, disrupting projects, etc. The internet, while imperfect, persists.
Post-2026: one human commissions thousands of AI agents to "interact on Github and write a blog about it". The public-facing internet becomes entirely unusable.
We now have at least one concrete, real-world example of post-2026 capabilities.
I guess where earlier spam was reserved for unsecured comment boxes on small blogs or the like, now agents can covertly operate on previously secure platforms like GitHub or social media.
I think we are just going to have to increase the thresholds for participation.
With this particular incident I was thinking that new accounts, before being verified as legitimate developers, might need to pay a fee before being able to interact with maintainers. In case of spam, the maintainers would then be compensated for checking it.
So about $75k for the bottom end? The quoted numbers sound about right in PPP terms in that case.
We do! In many jurisdictions, there are lots of laws that pierce the corporate veil.
See https://www.reddit.com/r/TrueReddit/comments/1q9xx1/is_it_ok... or similar discussions: basically, when you run over someone in a car, statistically they will call it an accident and you get away scot-free.
In any case, you are right that often people in cars or companies get away with things that seem morally wrong. But not always.
If your company screws up and it is found out that you didn't do your due diligence then the liability does pass through.
We just need to figure out a due diligence framework for running bots that makes sense. But right now that's hard to do because agentic robots that don't completely suck are just a few months old.
In theory, sure. Do you know many examples? I think, worst case, someone being fired is the more likely outcome.
> It's externalization on the personal level
Instead of the corporate level.
See, this is the fun thing about liability: we tend to attempt to limit scenarios where people can cause near unlimited damage when they have very limited assets in the first place. Hence why things like asymmetric warfare are so expensive to attempt to prevent.
But hey, have fun going after some teenager with 3 dollars to their name after they cause a billion dollars in damages.
Not unlike nuclear weapons, this space is fairly self-regulating in that there's a very, very high financial bar to clear. To train an AI model you need to have many datacenters full of billions of dollars of equipment, thousands of people to operate it, and a crack team of the world's leading experts running the show. Not quite the scale of the Manhattan Project, but definitely not something I'll worry about individuals doing anytime soon. And even then there's no hint of a successful test, even from all these large, staffed, funded research efforts. So before I worry about "damages" of any magnitude, let alone billions of dollars' worth, I'll need to see these large research labs produce something that can do some damage.
If we get to the point where there's some tangible, nonfiction threat to worry about then it's probably time to worry about "safety". Until then, it's a pretend problem which serves only to make AI seem more capable than it actually is.
He was an international student from Vietnam. His family woke up one day, got a phone call, and learned he was killed. I guess there was nobody to press charges.
She never faced any accountability for the 'accident'. She gets to live her life, and she now runs a puppetry education for children. Her name even seems to have been scrubbed from most of the articles about her killing my friend.
So, I think about this regularly.
I was a cyclist at the time so I was aware of how common this injustice was, but that was the first time it hit so close to home. I moved into a large city and every cyclist I've met here (every!) has been hit by a car, and the car driver effectively got only a slap on the wrist. It's just so common.
> Her name even seems to have been scrubbed from most of the articles about her killing my friend.
I'm somewhat surprised there were even articles? Are road fatalities uncommon enough in the US that everyone gets written up? Or was this a special enough one?