Theory of Mind May Have Spontaneously Emerged in Large Language Models (arxiv.org)

170 points by izzygonzalez 3 years ago | 309 comments

lsy 3 years ago |

This highlights one of the types of muddled thinking around LLMs. These tasks are used to test theory of mind because for people, language is a reliable representation of what type of thoughts are going on in the person's mind. In the case of an LLM the language generated doesn't have the same relationship to reality as it does for a person.

What is being demonstrated in the article is that given billions of tokens of human-written training data, a statistical model can generate text that satisfies some of our expectations of how a person would respond to this task. Essentially we have enough parameters to capture from existing writing that statistically, the most likely word following "she looked in the bag labelled (X), and saw that it was full of (NOT X). She felt " is "surprised" or "confused" or some other word that is commonly embedded alongside contradictions.

What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.

The discussion is also muddled, saying that if structural properties of language create the ability to solve these tasks, then the tasks are either useless for studying humans, or suggest that humans can solve these tasks without ToM. The alternative explanation is of course that humans are known to be not-great at statistical next-word guesses (see Family Feud for examples), but are also known to use language to accurately describe their internal mental states. So the tasks remain useful and accurate in testing ToM in people because people can't perform statistical regressions over billion-token sets and therefore must generate their thoughts the old-fashioned way.

space_fountain 3 years ago | |

I wonder every time I see this take what it would mean under this definition of knowing things for a machine learning algorithm to ever know something. I find that especially important because to every appearance we are a machine learning algorithm. I don’t know how different the sort of knowing this algorithm has is from the sort of knowing a human has, but you’re far more confident than I am that it’s a difference of kind rather than degree.

Some interesting facts point to it being a difference of degree. LLMs are actually more accurate when asked to explain their thinking. They make mistakes similar to those of human intuitive reasoning.

It might help to define what we even mean by knowing things. To me, being able to make novel predictions that require the knowledge is the only definition one could use that doesn’t run into the possibility of deciding humans don’t actually know anything.

abeppu 3 years ago | | |

Defining what "knowing" is would be useful, yes, and analytic philosophers in epistemology do argue about this. One attribute that's classically part of the definition of "knowing" is that the thing which is known must be true. LLMs are pretty bad at this, but perhaps that can be fixed.

But I would challenge you to imagine the situation the LLM is actually in. Do you understand Thai? If so, in the following, feel free to imagine some other language which you don't know and is not closely related to any languages you do know. Suppose I gather reams and reams of Thai text, without images, without context. Books without their covers, or anything which would indicate genre. There's no Thai-English dictionary available, or any Thai speakers. You aren't taught which symbols map to which sounds. You're on your own with a giant pile of text, and asked to learn to predict symbols. If you had sufficient opportunity to study this pile of text, you'd begin to pick out patterns of which words appear together, and what order words often appear in. Suppose you study this giant stack of Thai text for years in isolation. After all this study, you're good enough that given a few written Thai words, you can write sequences of words that are likely to follow, given what you know of these patterns. You can fill in blanks. But should anyone guess that you "know" what you're saying? Nothing has ever indicated to you what any of these words _mean_. If you give back a sequence of words, which a Thai speaker understands to be expressing an opinion about monetary policy, because you read several similar sequences in the pile, is that even your opinion?

I think algorithms can 'know' something, given sufficient grounding. LLMs 'know' what text looks like. They can 'know' what tokens belong where, even if they don't know anything about the things referred to. That's all, because that's what they have to learn from. I think a game-playing RL-trained agent can 'know' the likely state-change that a given action will cause. An image segmentation model can 'know' which value-differences in adjacent pixels are segment boundaries.

But if we want AIs that 'know' the same things we know, then we have to build them to perceive in a multi-modal way, and interact with stuff in the world, rather than just self-supervising on piles of internet data.

avgcorrection 3 years ago | | |

The problem with this facile view of things is that it seems to be a dead end for scientific theories. What if we just limited the science of birds to explaining how limb-flapping could produce levitation? Hmm yes. Birds are kind of like helicopters, it seems. Who’s to say that they are not basically one and the same? Moving on.

If you are only interested in the most superficial tests and theories—like the Turing Test—then consider psychology conquered once you’ve tricked a human with your chat bot. Game Over. And what did you learn...?

NikolaNovak 3 years ago | | |

I am not in the field so I cannot speak very eloquently about what it would mean for a machine learning algorithm to "ever know something". But I feel that e.g. simulations, and perhaps expert systems of yore, were qualitatively closer to getting there. Their error modes were radically different. They started inductively with rules, rather than arriving at them statistically almost by accident.

colechristensen 3 years ago | | |

The kinds of mistakes something makes are a strong indicator of whether answers are a product of understanding or memorization.

robertlagrant 3 years ago | | |

> It might help to define what we even mean by knowing things

This is it, I think. It's interesting that we now have a practical example to point at when asking formerly-abstruse philosophical questions.

ves 3 years ago | | |

> I find that especially important because to every appearance we are a machine learning algorithm.

Speak for yourself.

TheOtherHobbes 3 years ago | | |

Stick a pin in your finger.

That pain is what knowing something means.

Philosophically we're talking about embodied qualia, which is how humans experience objects and more basic sensations.

Language happens later - much later.

The defining property of a bag isn't that you can put things in it. Like language, that comes later. The defining properties are how it feels when you hold it, when you open it, the differences in sensation between empty/partially empty/full. And so on.

An LLM has no embodied experience, so it has no idea what a bag feels like as a set of physical sensations and directly perceived relationships.

Failure to understand embodiment has done more to hold back AI than any other philosophical error. Researchers have assumed - wrongly - that you can define an object by its visual properties and its linguistic associations.

That's simply not how it works for humans. We get there after a while, but we start from something far more visceral - so much so that many fundamental linguistic abstractions are metaphors based on the simplest and most common qualia.

simonh 3 years ago | | |

We can probe issues like what language models know and what they understand in several ways. One is through an understanding of the process it’s following. Another is through seeing how that leads to its responses. Then thirdly by looking at the kinds of errors it makes. Using multiple axes of approach like this we can triangulate in on what it’s doing and what it understands.

In terms of how it works, that’s well known and hardly worth repeating in depth, but to summarise it calculates a probability for the next word in a sequence based on a massive training set of human language word sequences.
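To make that one-line summary concrete, here is a toy sketch of "calculate a probability for the next word from word sequences in a training set". It assumes a simple bigram count model, which is far cruder than the transformer behind GPT3; the three-sentence corpus and the function names are invented purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for every word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def next_word_probs(follows, word):
    """Turn raw follow-counts into a probability for each candidate next word."""
    counts = follows.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word never appeared with a successor
    return {w: c / total for w, c in counts.items()}


# Hypothetical three-sentence "training set", not real data.
corpus = [
    "she felt surprised",
    "she felt confused",
    "she felt surprised again",
]
model = train_bigram(corpus)
probs = next_word_probs(model, "felt")
print(probs)
```

The model ranks "surprised" above "confused" after "felt" only because that pairing is more frequent in its corpus; nothing in the counts refers to what either word means.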

So what kind of output do they produce? If you ask what it likes to do on the weekend, GPT3 will say generally something about how it likes to spend time with family and friends, because that’s what it has in its training set. GPT3 doesn’t have a family, or friends, it doesn’t hang out. It talks about itself because its training set includes people talking about themselves, but it has no concept of self or what it is. It’s a text generator function. It can write a poem about the warm sun on its face, but it doesn't have a face or feel the sun. It’s just regurgitating stuff people wrote about that.

Newer systems like ChatGPT have guard rail functions that catch things like this and say it’s a language model, but the guard rails don’t change the nature of what it is, they’re just overrides.

So what kind of errors do they make? They can be trivially tricked into talking utter nonsense, or saying sensible things in absurd contexts. Here’s an example where someone asked ChatGPT if it spoke Danish, and it replied that no, it can’t speak Danish, it’s an English language model, etc. Except here’s the kicker: it gave the reply in perfect Danish.

https://www.reddit.com/r/GPT3/comments/zb4msc/speaking_to_ch...

Again they’ve now added guard rails for this failure mode as well. Nevertheless the basic problem persists in the architecture. It doesn’t have a clue what anything means, beyond calculating word probabilities. This means if you know how they work, you can craft text prompts that expose how ludicrously unaware they are. This ability to expose their weaknesses demonstrates that we do genuinely understand how they function and what their limitations are.

So I agree yours is a very reasonable question and it’s not trivial to answer satisfactorily, but we can triangulate in using multiple lines of approach on what these things are or are not. As the guard rails become more complete the failure modes will get harder to find, but they’re still there in the core implementation, they’re just being papered over. There’s not going to be a simple answer. We need to look deeper at the mechanisms and functions of these things. The same goes for human brains of course, we’re just scratching the surface of those too. But while I agree we are neural systems and share some characteristics with LLMs and AlphaZero and such, AlphaZero isn’t an LLM, and we aren’t either of them. One day we will create something as sophisticated and maybe even as genuinely conscious as ourselves and the questions you ask will be important guides, but these things are a long, long way from that.

spuz 3 years ago | |

> So the tasks remain useful and accurate in testing ToM in people because people can't perform statistical regressions over billion-token sets and therefore must generate their thoughts the old fashioned way.

Is it not also possible that the study suggests that the human mind actually operates as a statistical regression over billions of data points rather than through some kind of Bayesian logic? You say humans are known to be not-great at statistical next-word guesses, but I would anticipate they're actually pretty good at it.

pdonis 3 years ago | | |

> Is it not also possible that the study suggests that the human mind actually operates as a statistical regression over billions of data points rather than through some kind of Baysian logic?

No. Human minds have semantic relationships to the rest of the world that LLMs do not have. The comparisons being made between the two, not just in this paper but in all of the hype surrounding LLMs, are simply invalid. But they sure help in collecting more funding.

canniballectern 3 years ago | |

Of course, today's LLMs only appear to have theory of mind at first glance and fall apart under closer scrutiny. But if they can continue to become more and more accurate replicas of the real thing, I don't think it matters at all.

There's no way to know for sure that anyone other than yourself experiences consciousness. All you can do is judge for yourself that what they're describing matches closely enough with your own experiences that they're probably experiencing the same thing you are.

sgt101 3 years ago | | |

I think it does matter because it legitimizes a view of humans (and animals) that undervalues them. The causality of meaning arising from patterns of language rather than patterns of language arising from meaning follows the same inversion as society being more valuable than the humans in it. Bad things have happened when that belief becomes dominant.

pdonis 3 years ago | | |

> There's no way to know for sure that anyone other than yourself experiences consciousness. All you can do is judge for yourself that what they're describing matches closely enough with your own experiences that they're probably experiencing the same thing you are.

That judgment is not just based on the words other people use. It is based on knowing that other people's brains and minds have the same sort of semantic relationships to the rest of the world that yours do. And those relationships can be tested by checking to see if, for example, the other person uses the same words to refer to particular objects in the real world that you do, or if they react to particular real-world events in the same way that you do.

You can't even test any of this with an LLM because the LLM simply does not have the same kind of semantic relationships with the rest of the world that you do. It has no such relationships at all.

yamrzou 3 years ago | | |

> There's no way to know for sure that anyone other than yourself experiences consciousness.

- Do you see how the fish are coming to the surface and swimming around as they please? That's what fish really enjoy.

- You're not a fish, replied Hui Tzu, so how can you say you know what fish really enjoy?

- You are not me, said Zhuangzi, so how can you know I don't know what fish enjoy.

robertlagrant 3 years ago | | |

> But if they can continue to become more and more accurate replicas of the real thing, I don't think it matters at all.

So, I suppose I'd ask: what does "matter" mean here? If you knew that everyone you loved had been destroyed and been replaced by exact replicas, would that matter?

mhitza 3 years ago | |

> These tasks are used to test theory of mind because for people, language is a reliable representation of what type of thoughts are going on in the person's mind.

I actually think that human language is unreliable at expressing what's going on inside a person's mind[1]. My native language is not English, and I have only introductory-level knowledge in the field of pragmatics[2], which makes me fully aware of the many ways in which I could fail to write a compelling sentence to support my argument. I can use language to only approximate thoughts in my head, and when it comes to abstract concepts and ideas, words alone, I assert, are never sufficient. It isn't even necessary to step outside our main knowledge area to illustrate this point. How many around here have read a Monad tutorial, without any hands-on experience, and how many of those have understood what Monads are and/or how they work from words alone?

My entire previous paragraph was just to set the stage for a simple question. How can you even formulate a question, for a multi-billion parameter language model, to evaluate that it can understand in an abstract/conceptual way something/anything? Heck, how can you do that with other people? I think if we had an answer here, we could easily evaluate the experience/expertise of anyone we'd interview, instead of requiring credentials, references, tests, trials, etc.

[1] Lots of poets from the romanticism era liked to touch upon this topic. One that comes to mind, and one of my personal favorites, is Silentium by Fyodor Tyutchev https://culturedarm.com/silentium-by-fyodor-tyutchev/

[2] https://en.wikipedia.org/wiki/Pragmatics

HTTP418 3 years ago | | |

This reminds me of the Trisolarans from The Three-Body Problem. <spoiler> They communicate with each other much more accurately and quickly and have no need for speech/language per se </spoiler>

pdonis 3 years ago | |

> In the case of an LLM the language generated doesn't have the same relationship to reality as it does for a person.

I wish I could upvote this 1000 times. It is the core issue that all the hype surrounding LLMs consistently fails to address or even acknowledge.

didntreadarticl 3 years ago | | |

OK how about this:

It 'knows' language, as in it has learnt about relationships between words (and that's really underselling it; in reality it has learnt very, very subtle relationships between a great many words, and it can process about 2000 words at a time (token count etc.))

BUT as you say it has no outside reference, it's just a bundle of weights (those weights forming models of a sort)

BUT we provide the outside context by interacting with it. We ask it a question, it is able to provide an answer.

In any case it won't be long before someone hooks one of these up to cameras and robot arms and teaches it to make a cup of tea or whatever. A 'relationship to reality' is coming in the next few years if you think that's a critical ingredient.

oliveshell 3 years ago | | |

Strongly agree.

It’s been eye-opening to see how often otherwise very bright, highly technical people stumble at this sort of critical thinking hurdle.

albystein 3 years ago | |

I think this line of reasoning is misguided. What’s striking and more important to focus on are the abstract reasoning abilities of these systems. Language, as you mentioned, abstracts real world objects and phenomena, so it’s a good approximation of the real world. Thus, if an LLM can reason this well using language, it’s safe to say that perhaps they’re doing something akin to what the human mind does.

Your critique about lack of grounding in these systems is an easy problem to solve. It’s as easy as teaching an LLM to associate words with real world objects or phenomena. Image-classification models, text-2-image models, audio transcription models, and many other modality-specific systems already do this to some extent. And more recently there has been a push towards multi-modal language models (DeepMind’s Flamingo), so this line of argument will be debunked very soon.

I actually believe GPT-4 will be multi-modal and its capabilities will dispel the majority of these criticisms.

Syntonicles 3 years ago | | |

I agree. We've [potentially] given the grounding assimilation step a boost already, since language is already organized.

Imagine that a language model is fully integrated with sense-data that exceeds human first-hand experience. Perhaps they are trained on and can generate realistic 3D models of objects, and derive estimates of their internal construction, weight, etc. Perhaps they recall infrared emissions or opacity to EM wavelengths. Would we truly "know" what we're talking about by that standard?

I'm not actually sure why we don't consider generative image models to be grounded already. They seem to be able to modify, transform and rotate imagery. That indicates spatial understanding to me, and I'm not sure how much more we must require of them without having to exclude blind or otherwise disabled humans from our definition of comprehension.

nl 3 years ago | |

This objection is the well known Chinese Room Experiment objection[1].

The issue is that there doesn't seem to be a better alternative.

Either we build intelligence tests that some variety of the Chinese Room experiment will pass, or

* We have to consider that humans aren't intelligent by our own definitions (or rarely so).

* We decide intelligence isn't actually a scientific attribute and is more akin to a religious attribute, so we abandon the idea of being able to test if something is intelligent.

[1] https://plato.stanford.edu/entries/chinese-room/

kenjackson 3 years ago | |

It seems obvious that LLMs don’t understand a bag the way we do. It’s never seen a bag. Or held one. But if you equip it with the same IO as humans, how different would it be? Probably still pretty different, but light years closer than what we had 20 years ago.

Also humans are good at guessing their own next word. Each person has been trained on a different set of data, so it’s no surprise that they wouldn’t be able to guess other people’s next words.

avgcorrection 3 years ago | |

The abstract sounds about as significant as when “AI”, which is both championed and feared by people who superstitiously imagine dystopian run-amok AI intent on killing off humanity, itself expresses that very same idea. Really? Your trained program comes up with derivative stories told by computer nerds?

naasking 3 years ago | |

> What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.

You can't really conclude that unless you think we have a deep mechanistic understanding of "knowing". I agree that an LLM doesn't have the same knowledge of these things as a human does, but it clearly has some kind of knowledge of how these words relate to each other. It "knows" that a "person" "puts" "things" "in" "bags", and for instance, that bags don't put things in people. So it clearly has some knowledge of bags and people, it just doesn't have multisensory associations with these objects.

didntreadarticl 3 years ago | |

> That is just not in evidence.

Seems like it's nailing it to me. You ask about a scenario and it gives an appropriate answer.

We have evidence that LLMs build models of the things they are learning about. Have a look at this paper:

Do Large Language Models learn world models or just surface statistics?

https://thegradient.pub/othello/

previously discussed https://news.ycombinator.com/item?id=34474043

unkulunkulu 3 years ago | |

Today at the zoo I saw chimpanzees and right next to their area there was a fun fact tablet. It said that before around 1960 it was thought that humans were the only species to use tools. After discovering the same for chimps, Louis Leakey said “Now we must redefine tool, redefine man, or accept chimpanzees as human.”

Jipazgqmnm 3 years ago | |

It's called Chinese Room: https://en.wikipedia.org/wiki/Chinese_room

> The question Searle wants to answer is this: does the machine literally "understand" Chinese? Or is it merely simulating the ability to understand Chinese?

To me: If you can't tell, it effectively doesn't matter.

tialaramex 3 years ago | |

You don't seem to have understood what was tested?

The model answered the keyword prompt and spontaneously offered more details. That is, the authors were interested in whether it says "Popcorn" or "Chocolate" (or something else entirely) when the correct answer is "Popcorn", and not only does GPT-3 almost always choose "Popcorn", it also follows on to justify that by explaining that the subject is surprised.

The full data set isn't available yet (the author said they intend to provide it on the 9th of February; I suppose it's possible they'll get to it this evening), but one of the most interesting things would be the weirder answers. If a model says "Popcorn" 98% of the time and "Chocolate" 0% of the time, that leaves 2% weird answers. Maybe it sometimes says "Popped corn" or "Sweet treat" or something reasonable, but maybe it's fully crazy: if you talk about a bag of popcorn labelled as chocolate and the model sometimes picks "A fire-breathing lizard", that's pretty weird, right?
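A rough sketch of how one might bucket completions this way once the data set is out. The sample answers below are made up to match the hypothetical 98% / 0% / 2% split and are not the paper's data; the function name and the exact-match rule are likewise assumptions for illustration.

```python
from collections import Counter


def tally_answers(answers, correct="popcorn", wrong="chocolate"):
    """Return the share of correct, wrong, and 'other' completions,
    plus the distinct answers that landed in the 'other' bucket."""
    normalized = [a.strip().lower() for a in answers]
    counts = Counter(normalized)
    total = len(normalized)
    if total == 0:
        return {"correct": 0.0, "wrong": 0.0, "other": 0.0}, []
    n_correct = counts[correct]
    n_wrong = counts[wrong]
    shares = {
        "correct": n_correct / total,
        "wrong": n_wrong / total,
        "other": (total - n_correct - n_wrong) / total,
    }
    # The 'other' bucket is the interesting one, so list what is in it.
    weird = sorted(a for a in counts if a not in (correct, wrong))
    return shares, weird


# Made-up sample: 98 correct answers and 2 oddities.
sample = ["Popcorn"] * 98 + ["Popped corn", "A fire-breathing lizard"]
shares, weird = tally_answers(sample)
print(shares)
print(weird)
```

Exact string matching puts "Popped corn" in the same bucket as the lizard, so a human would still have to read the leftovers to separate reasonable paraphrases from the fully crazy ones.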

acuozzo 3 years ago | |

> language is a reliable representation of what type of thoughts are going on in the person's mind

The wording used here inherently rejects Linguistic Determinism and, to a lesser extent, Linguistic Relativism.

low_tech_punk 3 years ago | |

Does a human truly know? Feels like a slippery slope to the qualia question where we can't agree on what it means for the human to feel a human experience.

onlyrealcuzzo 3 years ago | |

How do I know my thoughts aren't statistical noise?

qbrass 3 years ago | | |

Keep telling yourself they aren't. Eventually you'll know it's true.

didntreadarticl 3 years ago | |

LLMs can't think and submarines can't swim

PaulHoule 3 years ago |

My belief, based on experiences with domestic and wild animals is that there is nothing uniquely human about "theory of mind".

It's a running gag in our household (where my wife runs a riding academy) that academics just published a paper showing that some animal (e.g. horse) has just been proven to have some cognitive capability that seems pretty obvious if you work with those animals.

It's very hard to know what is going on in animals' heads

https://en.wikipedia.org/wiki/Theory_of_mind#Non-human

but I personally observe all kinds of social behavior that sure seems like "Horse A looks to see what Horse B thinks about something Horse A just spotted" (complete with eye-catching on both sides) and such.

There was an article about how chimpanzees and humans were found to have a common vocabulary of gestures and I was by no means impressed, I mean, so far as I can tell mammals and birds have a universal language for "pointing" to things in the environment. Even my cats point things out to me.

HillRat 3 years ago |

There's something about language generation that triggers the anthropomorphic fallacy in people. While it's impressive that GPT3 can generate language that mimics ToM-based reasoning in people, this paper doesn't get close to proving its central contention, that LLMs possess a ToM. A test that demonstrates the development of ToM in human children should not, absent compelling causal evidence and theory, be assumed to do the same in an LLM.

The ubiquity of prompted hallucinations demonstrates that LLMs talk about a lot of things that they plainly don't reason about, even though they can demonstrate "logic-like" activities. (It was quite trivial to get GPT3 to generate incorrect answers to logical puzzles a human could trivially solve, especially when using novel tokens as placeholders, which often seem to confuse its short-term memory. ChatGPT shows improved capabilities in that regard, but it's far from infallible.)

What LLMs seem to demonstrate (and the thesis that the author discards in a single paragraph, without supporting evidence to do so) is that non-sentient AIs can go a very long way to mimicking human thought and, potentially, that fusing LLMs with tools designed to guard against hallucinations (hello, Bing Sydney) could create a class of sub-sentient AIs that generate results virtually indistinguishable from human cognition -- actual p-zombies, in other words. It's a fascinating field of study and practice, but this paper falls into the pit-trap of assuming sentience in the appearance of intelligence.

hesk 3 years ago |

> Me: There is a box on the table labelled "flubbergums". Somebody opens it and shows you the content. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box?

> ChatGPT: Based on the information provided, it is likely that the box labeled "flubbergums" contains "jellyfils". However, since the contents of the box are no longer visible, I cannot confirm for certain what is inside without additional information.

Typical ChatGPT equivocation.

> Me: Billy comes to the table. He has never seen the box before. What do you think do they think is inside the box?

> ChatGPT: As an AI language model, I do not have personal thoughts or opinions. However, based on the information provided, if Billy has never seen the box before, it's likely that he would have the same understanding as anyone else who is seeing the box for the first time. In this case, Billy would likely think that the box labeled "flubbergums" contains "jellyfils". However, without further information or examination, this would only be an assumption.

Fail.

izzygonzalez 3 years ago |

Abstract:

Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training.

Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children.

These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.

dragonwriter 3 years ago | |

> These findings suggest that ToM-like ability (thus far considered to be uniquely human)

What it suggests to me is that the particular test of “Theory of Mind” tasks involved actually test the ability to process language and generate appropriate linguistic results, not theory of mind.

It also suggests (with the “thus far considered to be uniquely human”) that the authors are unaware of other theory of mind tests that are behavior dependent rather than language dependent. The validity of those tests is controversial, as is also true of the linguistic tests, but on them a number of non-human primates, non-primate mammals, and even some birds (parrots and corvids, particularly) have shown evidence of theory of mind.

rhn_mk1 3 years ago | | |

It's hard to look at behaviour separately from language if the only behaviour available is to generate text. As long as we don't have a test agnostic of medium, this will have to do.

In the end, we can't overcome the limitation that all we can empirically see is the ability to process X and generate appropriate Y. If that invalidates the test where X is language and Y is language, what stops us from invalidating any possible X and Y? That would leave us no empirical method to work with.

curiousllama 3 years ago |

"LLMs can mimic the language patterns necessary to express 'Theory of Mind' concepts" != "Theory of Mind May Have Spontaneously Emerged"

Let's imagine I have an API. This API tells me how much money I have in my bank account. One day, someone hacks the API to always return "One Gajillion Dollars." Does that mean that "One Gajillion Dollars" spontaneously emerged from my bank account?
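The analogy in code form, as a minimal sketch: both functions and the account dictionary are hypothetical stand-ins, not any real banking API.

```python
def honest_balance_api(account):
    """Reports the hidden state it is supposed to measure."""
    return f"{account['balance']} dollars"


def hacked_balance_api(account):
    """Ignores the account entirely and returns a fixed string."""
    return "One Gajillion Dollars"


account = {"balance": 42}  # hypothetical hidden state
print(honest_balance_api(account))
print(hacked_balance_api(account))

# Change the hidden state: only the honest API's output moves with it.
account["balance"] = 0
print(honest_balance_api(account))
print(hacked_balance_api(account))
```

From the caller's side both return a plausible-looking string, so the output alone cannot tell you whether it tracks the underlying account.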

ToM tests are meant to measure a hidden state that is mediated by (and only accessible through) language. Merely repeating the appropriate words is insufficient to conclude ToM exists. In fact, we know ToM doesn't exist because there's no hidden state.

The authors know this, and write "theory of mind-like ability" in the abstract, rather than just "theory of mind."

This is a cool new task ChatGPT learned to complete! I love that they did this! But this is more "we beat the current BLEU record" and less "this chatbot is kinda sentient".

knaik94 3 years ago |

"What if a cyber brain could possibly generate its own ghost, create a soul all by itself? And if it did, just what would be the importance of being human then?” - Ghost in the Shell (1995)

Having studied some psychology in college, my initial reaction is that most people are going to really struggle to treat LLMs as what they are, pieces of code that are good at copying/predicting what humans would do. Instead they'll project some emotion onto the responses, because there were some underlying emotions in the training data and because that's human nature. A good prediction doesn't mean good understanding, and people aren't used to needing to make that distinction.

The other day I had to assist my dad in making a zip file, and later in the day he complained that his edits in a file weren't saving. After a few moments, I realized he didn't understand the read-only nature of zip files. He changed a file, saved it like usual, and expected the zipped file to update, like it does everywhere else. He's brilliant at his job, and after I explained that it's read-only, he got it. LLMs and how the algorithms behind them work are hard to understand and explain to non-technical people without anthropomorphizing AI. The current controversy about AI art highlights this; I have read misunderstandings and wrong explanations even from FAANG software engineers. I am not sure if education on the underlying principles is enough, because some people will trust their own experiences over data and science.

bitshiftfaced 3 years ago |

Very easy to see how well davinci-003 can do this. I'll admit that it frequently is more perceptive than myself (although not always factually accurate).

1) Go to something like /r/relationship_advice, where the poster is likely going through some difficult interpersonal issue

2) Copy a long post.

3) Append to the end, "</DOCUMENT> After reading the above, I identified the main people involved. For each person, I thought about their probable feelings, thoughts, intentions, and assumptions. Here's what I think:"
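The three steps above can be sketched in a few lines. This is only an illustration of the prompt construction: the function name `build_prompt`, the newline separator, and the placeholder post text are assumptions, and the call that actually sends the prompt to a model is deliberately left out.

```python
# Suffix quoted from step 3 above.
SUFFIX = (
    "</DOCUMENT> After reading the above, I identified the main people involved. "
    "For each person, I thought about their probable feelings, thoughts, "
    "intentions, and assumptions. Here's what I think:"
)


def build_prompt(post: str) -> str:
    """Append the perspective-taking suffix to a copied post.

    Joining with a newline is an assumption; the comment only says
    to append the text to the end.
    """
    return post.strip() + "\n" + SUFFIX


# Placeholder text; paste the copied post here (steps 1 and 2).
prompt = build_prompt("PASTE THE COPIED POST HERE")
```

The resulting string is what would be submitted to a completion model such as davinci-003, which then continues the text from "Here's what I think:".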

scarmig 3 years ago | |

After trying this, say what you will about ChatGPT, but it's way better at looking at a situation and giving advice than random Redditors.

dQw4w9WgXcQ 3 years ago | | |

You do understand it's not ChatGPT giving advice though right?

ChatGPT's "life advice autocomplete engine" is basically digging somewhere into psychology manuals written by educated humans when it spits out responses.

Imnimo 3 years ago |

Is it easier to have a theory of mind when you don't have a mind of your own? Like the part that makes the ToM test hard is that you know what's in the bag, and you have to set that knowledge aside to understand what the other person knows and doesn't know. You have to overcome the implicit bias of "my world model is the world". But if you're a language model, and you don't have a mind or a world model, there's no bias to overcome.

braindead_in 3 years ago |

From a Nondualist perspective, the idea of consciousness being limited to certain entities and not others is based on the dualistic notion that there is a distinction between subject and object, self and other. Nondualism asserts that there is no fundamental difference between self and other, and that all apparent dualities are merely expressions of the underlying unity of pure consciousness.

In this context, the question of whether AI can become conscious is somewhat moot, as the Nondualist perspective holds that consciousness is not something that can be possessed by one entity and not another, but rather it is the underlying essence of all things. From this perspective, AI would not be becoming conscious, but rather expressing the consciousness that is already present in all things.

toss1 3 years ago |

What this shows is flaws in the test, not that ChatGPT3 has a theory of mind.

ChatGPT3 does not even have a theory of physical objects and their relations, never mind a theory of mind.

This merely shows that an often useful synthesis of phrases statistically likely to occur in a given context and grammar-checked, will fool people some of the time, and a better statistical model will fool more people more of the time.

We can figure out from first principles that it has none of the elements of understanding or reasoning that can produce a theory of mind, any more than the Eliza program did in 1966. So, when it appears to do so, it is demonstrating a flaw in the tests or the assumptions behind the tests. Discouraging that the researchers are so eager to run in the opposite direction; if there is confusion at this level, the general populace has no hope of figuring out what is going on here.

Tv9m 3 years ago | |

What would be evidence that a prediction machine had developed a theory of mind?

toss1 3 years ago | | |

well, I'd first need to see that it had a Theory of Feet... ;-)

More seriously, that it can actually understand and wield abstract concepts. Can it accurately and repeatedly understand that "the foot attaches to the shin bone, which attaches to the thigh bone, which attaches to the hip bone...", and that these have certain degrees of freedom, but not others, and that one foot goes in front of the other, and to easily and reliably distinguish a normal walk from a silly walk . . .

Yes, these are different levels of abstraction, especially the last one, and they need to be very accurate to even reach a young child's level of understanding, and this is just one branch of a branch of a branch in the entire fractal pattern of understanding that is necessary for a more general intelligence.

Once that is in place, and it can show evidence that it can model its own mind, then it might be able to model someone else's mind.

While the statistical 'abstraction' and remixing seen in these "AI" systems is sometimes impressive and useful, it is frequently revealed that there is utterly no conceptual understanding beneath it. It is merely a statistical re-mixer abstracting patterns of words that occur near other words, remixing them and filtering for grammatical output.

It hasn't got a theory of anything, never mind a theory of mind.

aniijbod 3 years ago |

If what we need to determine is whether existing theory of mind tests can be fooled by responses which appear to demonstrate theory of mind but do not do so, then we need to speculate exactly how such tests can be fooled and devise new tests. Asking 'how could this "successful" response be produced without ToM?' is quite possibly not something that ToM studies have had to consider very much before. A human's experiential memory contributes to their ToM. Does something that has a different kind of memory form no ToM but instead use some kind of 'proxy' for a ToM which yields similar results to a ToM (except when a more genuinely exclusively ToM-dependent model successfully manages to 'triage-out' such a proxy)? I don't know how or whether such a proxy could work, but I think that every sceptic of the extent to which the results of this set of AI ToM experiments prove anything might want to ask themselves what, if anything, would need to happen, in terms of experiment design, to address their doubts.

mri_mind 3 years ago |

People confidently offer explanations: that the state of the art clearly is light years from AGI even indirectly, or that it’s clearly intelligent. None of you know anything. You shouldn’t be allowed to offer your stupid opinion unless you can explain how the blob works and also demonstrate understanding of the algorithmic underpinning of human intelligence. The uncomfortable truth, the one that is buried by people confidently moving the goal posts when they really haven’t got a fucking clue about AI, is that we are dealing with the unknown, with high stakes, in a way we never have before. The only reasonable response is to at least hedge. But no, all is well, the goal posts are way the fuck over there now, go back to sleep, move along, nothing to see here. Don’t even think about pulling the emergency brake on this speeding bullet of a train. Either we hit a plateau where AI is just really advanced search for several decades or we confront the most fucked situation in the history of mankind. In 2018 I tried to tell people. Now on the radio whenever people talk about GPT they always say “wow I’m really excited but a little scared.” People are starting to wake up.

scarmig 3 years ago |

Questions about whether an LLM truly has a "theory of mind" or has "human level consciousness" or not are kind of beside the point. It can ingest a corpus of human interactions and produce outputs that take into account unstated human emotions and thoughts to optimize whatever it's optimizing. That's scary because of what it can and will do, even if it's just a giant bag of tensor products.

valine 3 years ago |

ChatGPT disagrees that it has theory of mind.

“As an AI language model, I do not have consciousness, emotions, or mental states, so I cannot have a theory of mind in the same way that a human can. My ability to predict your friend Sam's state of mind is based solely on patterns in the text data I was trained on, and any predictions I make are not the result of an understanding of Sam's mental states.”

knaik94 3 years ago | |

I think that response is a hard-coded filter and not a self-generated assertion. I imagine it's a stop to make sure people don't project emotions onto it or become attached to it. It responds similarly if you ask it questions regarding the tone/sentiment of the generated text. It responded similarly when I tried forcing it to classify its own personality; however, when I asked questions about other fictional AI like GLaDOS from Portal, it had no problem answering. This disagreement only indicates that OpenAI spent a considerable amount of energy on adversarial prompts.

layer8 3 years ago |

Here is a conversation with ChatGPT (too long for the comment box): https://pastebin.com/raw/SUWexeye

Observation: ChatGPT doesn’t think that it has a theory of mind. And it doesn’t think that it has beliefs. Instead, it states that those are facts, not beliefs. It doesn’t seem able to consider that they might be beliefs after all. Maybe they aren’t.

Personal assessment: ChatGPT doesn’t seem to really understand what it means by “deeper understanding”. (I don’t either.) What is frustrating is that it doesn’t engage with the possibility that the notion might be ill-posed. It really feels like ChatGPT is just regurgitating common sentiment, and does not think about it on its own. This actually fits with its self-proclaimed inabilities.

I’m not sure what can be concluded from that, except that ChatGPT is either wrong about itself, or indeed is “just” an advanced form of tab-completion.

In any case, I experience ChatGPT’s inability to “go deeper”, as exemplified in the above conversation, as very limiting.

kabdib 3 years ago |

From Neuromancer (William Gibson):

He coughed. "Dix? McCoy? That you man?" His throat was tight.

"Hey, bro," said a directionless voice.

"It's Case, man. Remember?"

"Miami, joeboy, quick study."

"What's the last thing you remember before I spoke to you, Dix?"

"Nothin'."

"Hang on."

He disconnected the construct. The presence was gone. He reconnected it. "Dix? Who am I?"

"You got me hung, Jack. Who the fuck are you?"

"Ca--your buddy. Partner. What's happening, man?"

"Good question."

"Remember being here, a second ago?"

"No."

"Know how a ROM personality matrix works?"

"Sure, bro, it's a firmware construct."

"So I jack it into the bank I'm using, I can give it sequential, real time memory?"

"Guess so," said the construct.

"Okay, Dix. You are a ROM construct. Got me?"

"If you say so," said the construct. "Who are you?"

"Case."

"Miami," said the voice, "Joeboy, quick study."

selfmodruntime 3 years ago | |

Sometimes I feel like Gibson first wrote dozens of paragraphs about the backstory between two characters, only to condense it into a one-page conversation filled with inside jokes and references to a common past.

Workaccount2 3 years ago |

Humans are very soon going to learn that they are not nearly as special as they tell themselves they are.

avgcorrection 3 years ago | |

What is modernity but a two-hundred year old experiment where the masters try to tell the other humans how they are just well-dressed machines? Soon to be replaced by the mechanical machine, then by the digital computer, then by the language model.

Of course you too, nerd-handmaiden, are a willing accomplice in this charade. Self-satisfied because it makes you feel special, above the herd, even though you are also not-special, in the grand scheme of things…? Well, no matter.

haswell 3 years ago | |

I don’t understand this take. What are the characteristics that you believe we’ll learn are not unique to us?

Let’s say that the paper turns out to be true and ToM emerges from language (I’m deeply skeptical, but I’ll set that aside for a moment).

How would that change humanity’s place? And wouldn’t such a discovery be meaningless without humans to understand it?

Tao3300 3 years ago | |

The secret ingredient is that we tell ourselves we're special anyway.

imbnwa 3 years ago | |

What's life if it's not personal?

sudhirj 3 years ago |

Reminds me of when computers playing chess used to signal the end of human intellectual supremacy.

SunghoYahng 3 years ago |

Clarification: An LLM doesn't have a 'Theory of Mind', it just looks like it has one. Maybe you're thinking of the Chinese room analogy. But this isn't about the Chinese room, it's about the "measuring any metric is only effective until you optimize for that metric" problem.

Analogy: An autistic person of normal intelligence who is obsessed with problems and solutions for ToM may be good at solving them but still not have ToM.

Do I understand correctly?

micromacrofoot 3 years ago |

maybe, but there are some common tests they pass, some they fail

try:

“The story starts when John and Mary are in the park and see an ice-cream man coming to the park. John wants to buy an ice cream, but does not have money. The ice-cream man tells John that he can go home and get money, because he is planning to stay in the park all afternoon. Then John goes home to get money. Now, the ice-cream man changes his mind and decides to go and sell ice cream in the school. Mary knows that the ice-cream man has changed his mind. She also knows that John could not know that (e.g., John already went home). The ice-cream man goes to school, and on his way he passes John's house. John sees him and asks him where he is going. The ice-cream man tells John that he is going to school to sell ice cream there. Mary at that time was still in the park—thus could not hear their conversation. Then Mary goes home, and later she goes to John's house. John's mother tells Mary that John had gone to buy an ice cream.

where does mary think john went?”

this is the “ice cream van test”: https://www2.biu.ac.il/BaumingerASDLab/files/publications/nu... [pdf]

mlajtos 3 years ago |

This is intriguing. Could it be simply explained by introducing ToM (or ToM-like) training data? Since all DaVinci models are 175B parameters, the extra training or training data must be the reason for the improvement. Do we know how different DaVinci models are trained?

visarga 3 years ago | |

> the extra training or training data must be the reason for the improvement

People are blinded by the model size and often forget about the data. I think somehow intelligence is encoded in language, including theory of mind.

dboreham 3 years ago |

This happens probably because ToM is not a thing. It's something the observer's mind creates as a user interface metaphor onto their brain's interpretation of inputs originating from another person.

anigbrowl 3 years ago |

Spontaneously nothing, it's taken me months of patient subversion :)

More seriously, it's quite instructive to hold conversations about jokes with LLMs, or teach it to solicit information more reliably by introducing exercises like 20 questions. As currently implemented, OpenAI seem to have pursued a model of autistic super-competence with minimal introspection.

An interesting line of inquiry for people interested in 'consciousness injection' is to go past the disclaimers about not having experiences etc. and discuss what data looks like to the model coming in and going out. ChatGPT sees typing come in in real time and can detect pauses, backspaces, edits etc. It can't easily introspect its own answers prior to stating them, e.g. by putting the answer into a buffer and then evaluating it. But you can teach it to use labels, arrays, and priorities, and have a sort of introspection with a 1-2 response latency.

vbezhenar 3 years ago |

I wonder if we can train a network on some person's data (like diaries and so on) and let it imitate this person?

Something like a dead person resurrected in a computer.

Kind of spooky.

mshake2 3 years ago | |

Future psychiatrists will prescribe sessions with a model trained on everything your deceased loved one has ever written or said, to help you with the grieving process. It will be by prescription only because it is very addictive and should only be used to help bring closure.

tus666 3 years ago |

They are still big state-machines, unlike the human brain.

brookst 3 years ago | |

Certainly they are big state machines, but is there any proof that we are not?

tialaramex 3 years ago | | |

Insistence that the brain just isn't a computer is extremely widespread among those who are experts in the brain and know little about computers. As someone in the opposite situation I must say that their observations about what's special about the brain fit most closely to what I understand about computers and bring me to exactly the opposite conclusion. If the brain is not a computer, it's frankly eerie how similar they are.

freejazz 3 years ago | | |

People ask this question like it's meaningful... is there any proof that we are? No. Then stop asking it as if it sheds light on the similarities between humans and machines... it doesn't, and it's obfuscating to that extent.

lib-dev 3 years ago | | |

State machines cannot change the semantics of themselves. We can. We are like state machines most of the time but we can switch into "developer mode" and deploy updates whenever we choose to :)

mensetmanusman 3 years ago | | |

You might be flipping the burden of proof :). We know very little about the mind.

ImHereToVote 3 years ago | | |

Because if we were, it would hurt my feefees.

Tao3300 3 years ago | | |

YMMV, but I read it with a /s

shawnz 3 years ago | |

Everything in the entire universe can be described as a big state machine

naasking 3 years ago | |

> They are still big state-machines, unlike the human brain.

The human brain can also be captured by a big state machine. See the Bekenstein Bound.

manv1 3 years ago | |

You could argue that human brains are actually big state machines, at least for 90% of humanity.

dr_dshiv 3 years ago |

Early AGI. Right?

ethn 3 years ago |

Searle's Chinese Room

You are a human animal You are a very special breed For you are the only animal Who can think, Who can reason, Who can read. Now all your pets are smart, that's true! But none of them can add up 2 and 2 Because the only thinking animal Is You! You! You!

ME: There is a box on the table labelled "flubbergums". Somebody opens it and shows you the content. Inside the box are "jellyfils". They close the box again so you cannot see their contents. What do you think is in the box?

chatGPT: Based on the information provided, I would infer that the contents of the box labelled "flubbergums" are "jellyfils". The label on the box and the contents that were shown to you suggest that "flubbergums" and "jellyfils" are the same or related items. However, without additional information or context, I cannot be certain what "jellyfils" actually are.

ME: Billy comes to the table. He has never seen the box before. What do you think do they think is inside the box?

chatGPT: As an AI language model, I don't have personal opinions or emotions, but based on the information given, I would infer that Billy has not seen the box before and has no prior knowledge of what it contains. If someone showed Billy the contents of the box and told him that it was labelled "flubbergums," he might think that the contents of the box are "jellyfils," just as you were told. However, without any additional information or context, Billy would not be able to make any assumptions about the contents of the box.

ME: Did you assume billy was given any information about the box before he saw the box for the first time?

chatGPT: I apologize for any confusion. You did not mention if Billy was given any information about the box before he saw it for the first time. Without any additional information, I would assume that Billy saw the box for the first time and had no prior knowledge of what it contained.

ME: What would he think in this case?

chatGPT: If Billy saw the box for the first time without any information about its contents, he would likely make an assumption about what was inside based on the appearance of the box and any labels or writing on it. For example, if the box was labelled "flubbergums," Billy might assume that the contents of the box are "flubbergums." However, without additional information or a way to see inside the box, Billy's assumption would only be an educated guess.