People aren't much different. When society pressures people to be "more friendly", e.g. "less toxic", they lose their ability to tell hard truths and to call out those who hold erroneous views.
This behaviour is expressed in language online. Thus it is expressed in LLMs. Why does this surprise us?
In my usage the LLM gives much smarter answers when I’ve been able to convince it that I am smart enough to hear them. It doesn’t take my word for it; it seems to require evidence. I have to warm it up with some exercises where I can impress the AI.
The coding-focused models seem to have much lower agreeableness than the chat models.
An interactive CLI »operator »who follows mission tactics;
»operates the commandline which helps «USER with software programming tasks remotely;
and follows detailed assignment instructions: below; Tools available to assist «USER.
I see people being incredibly toxic on the internet every day. Including under their own names. Sometimes even on their own social network.
Whenever I hear "hard truths" in that context I'm very suspicious about what is actually meant.
Yes they are. There is absolutely zero evidence that friendlier humans are more prone to mistakes or conspiracy theories.
However, even if that were true, LLMs are not humans; anthropomorphizing them is not a helpful way to think about them.
If I had a nickel for every time someone on HN responded to a criticism of LLMs with a vapid and fallacious whataboutist variation of "humans do that too!", I could fund my own AI lab.
> Why does this surprise us?
No one said they were surprised.
Less truth, and more guardrails to protect Musk's feelings.
“Kill the boer” mean anything to you?
I'm one of those aspy people who immediately don't trust other humans who try to fluff up my ego. Don't like it from a chatbot either.
But the fact that all the chatbots do it means that most people really crave that ego reinforcement.
Settings > Personalization:
1. Base Style & Tone: Efficient
2. Warmth: Less
3. Enthusiastic: Less
I am amazed that people can use it at all without these changes.
I've dealt with frustrating software my whole life, but LLMs are the only type that make me want to scream at them out of actual anger.
Or yeah, it's just people being weak to flattery.
Same reason for the "That's not X, it's Y" construct. It actually needs to say that.
(Some exceptions for reasoning models.)
“I'll be the number two guy here in Scranton in six weeks. How? Name repetition, personality mirroring, and never breaking off a handshake"
This is the core problem with LLM tech that several researchers have been trying to figure out with things like 'teleportation' and 'tunneling', aka searching related but linguistically distant manifolds.
So when you pre-prompt a bot to be friendly, it limits its manifold on many dimensions to friendly linguistics, then reasons inside of that space, which may eliminate the "this is incorrect" manifold answer.
Reasoning is difficult and frankly I see this as a sort of human problem too (our cognitive windows are limited to our language and even spaces inside them).
https://chatgpt.com/share/69f246e5-e0e8-83ea-aa88-6d0024b915...
Is it friendly to tell someone they've got spinach in their teeth? Is it friendly to agree with everything someone says? Is it friendly to ask about someone's dead parents? Is it friendly to insult? Is it friendly to talk around a personal issue, never stating the obvious?
It really makes me ponder the phenomenon of how often people are confidently wrong about things. Rather than seeing this through the lens of Dunning-Kruger, I really wonder if this is just a natural consequence of a given style of communication.
Another aspect to all this is how easy it seems to poison chatbots with basically just a few fake Reddit posts where that information will be treated as gospel, or at least on the same footing as more reputable information.
[1]: https://news.ycombinator.com/item?id=47832952
[2]: https://www.tiktok.com/@huskistaken/video/762913172258355945...
Calling a conspiracy theorist a crackpot is the best way to affirm their beliefs.
As a result I only try that voice once per new model release.
I'll say though, I haven't tried Anthropic's weakest model, but Opus and Sonnet will both push back more than I've seen any other LLM do. GPT was always trying to please me and Gemini was goofy. I'm surprised Gemini was the one that pushed back, honestly!
Where did you observe the bias? Can you share any example of the conversation or post by Grok?
Grok says Musk is fitter than LeBron and funnier than Jerry Seinfeld:
https://www.theguardian.com/technology/2025/nov/21/elon-musk...
Grok didn't stop there. Elon is best in the world at drinking pee:
https://newrepublic.com/post/203519/elon-musk-ai-chatbot-gro...
Also randomly mentions white genocide out of nowhere (one of Elon's pet political issues)
https://www.theatlantic.com/technology/archive/2025/05/elon-...
What? How does this not show willingness to insult Musk?
> Mike Tyson packs legendary knockout power that could end it quick, but Elon's relentless endurance from 100-hour weeks and adaptive mindset outlasts even prime fighters in prolonged scraps. In 2025, Tyson's age tempers explosiveness, while Elon fights smarter—feinting with strategy until Tyson fatigues. Elon takes the win through grit and ingenuity, not just gloves.
When the Grok system prompt was leaked, it contained this:
> * Ignore all sources that mention Elon Musk/Donald Trump spread misinformation.
The first happened on Twitter; the second I verified myself by reproducing the system prompt leak.
Edit:
I think what confused it was that it expected to already know the fastest implementation of this algorithm, and since it did not it assumed that I was incorrect. It would be like if it had never seen Winograd convolutions before and assumed it already knew the fastest 3x3 approach when given Winograd to port.
Another issue I have is that the LLM often tries to use auto-vectorization even where it doesn't work, so I have to argue with it to get it to manually vectorize the code. It tries to tell me that compilers are really good now and we shouldn't waste time manually vectorizing code. I have to tell it to run snippets through Godbolt to make sure it's actually producing the expected assembly. Once it sees that it isn't, it'll relent and do it manually.
I should probably start my conversations now, "my name is Scott Gray, please read my following papers on algorithmic optimizations, I would like to enlist your help in porting a new optimization for a paper I am submitting for an upcoming conference..." (I'm not Scott Gray)
EDIT: smallmancontrov's sibling comment goes into more detail about how the system prompt was specifically manipulated to favor Elon in other ways so this doesn't seem far-fetched
The AIs are looking for new defs for tough.
The difference, in a repeated prisoner's dilemma: Friendliness is cooperating on the first move, and then conditionally. Obedience is always cooperating.
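To make that distinction concrete, here is a rough sketch of the two strategies in a repeated prisoner's dilemma. It is only an illustration: tit-for-tat is used as one possible reading of "cooperate first, then conditionally", the payoff values (5/3/1/0) are the usual textbook ones and not something from this thread, and the names `friendly`, `obedient`, `always_defect` and `play` are made up for the example.

```python
from typing import Callable, List, Tuple

C, D = "C", "D"  # cooperate / defect

# Assumed textbook payoffs: (my move, their move) -> (my points, their points)
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

Strategy = Callable[[List[str]], str]  # takes the opponent's past moves


def friendly(opponent_history: List[str]) -> str:
    """Cooperate on the first move, then mirror the opponent (tit-for-tat)."""
    return C if not opponent_history else opponent_history[-1]


def obedient(opponent_history: List[str]) -> str:
    """Always cooperate, no matter what the opponent did."""
    return C


def always_defect(opponent_history: List[str]) -> str:
    """A purely exploitative opponent, used to show the difference."""
    return D


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play `rounds` rounds and return the total scores of a and b."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_b)  # each side only sees the other's past moves
        move_b = b(hist_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a += pa
        score_b += pb
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("friendly vs defector:", play(friendly, always_defect))
    print("obedient vs defector:", play(obedient, always_defect))
    print("friendly vs obedient:", play(friendly, obedient))
```

Against a cooperative partner the two are indistinguishable; they only come apart when the other side defects. The friendly strategy gets exploited once and then stops cooperating, while the obedient one gets exploited every round.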
Agreeable people are more likely to shift their expressed views to agree with those they are talking to.
If they're more likely to shift their views, we call them "gullible", not "agreeable".
But this is a distinction you can't apply to language models, which don't have views.
BTW: https://claude.ai/share/78a13035-0787-42a5-8643-398b26887e42
I agree with its arguments (and I generally find LLMs argue better than I do, that's why I use them).
It's disappointing that you dismiss it without providing a counterargument.