Stochastic Parrot(en.wikipedia.org) |
Basically there is this innate idea that if the basic building blocks are simple systems with deterministic behavior, then the greater system can never be more than that. I've seen this in spades within the AI community, "It's just matrix multiplication! It's not capable of thinking or feeling!"
Which to me always felt more like a hopeful statement rather than a factual one. These guys have no idea what consciousness is (nobody does) nor have any reference point for what exactly is "thinking" or "feeling". They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't.
So while yes, present LLMs likely are just stochastic parrots, the same technology scaled might bring us a model that actually is "something that is something to be like", and we'll have everyone treating it with reckless carelessness because "it's just a stochastic parrot".
Where do people get off saying no one has any idea what consciousness is? I agree that there is a significant sliver of a philosophical problem which remains stubborn (how precisely does physical activity produce qualia), but neuroscience knows quite a bit about what physical processes underlie our behavior from the behavior of individual neurons to the activity of the entire brain.
I object to the wholesale dismissal of neuroscience because thinking about the brain relative to LLMs is genuinely informative about what sorts of things you could expect to be going on in an LLM. And, to my mind, a real appraisal of the differences between brains and LLMs makes the case pretty strongly that LLMs experience nothing and are, furthermore, fairly well characterized as stochastic parrots.
"They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't." Prove is a very strong word, but I think it's actually quite possible to demonstrate via scientific observation that you differ from LLMs in many significant ways that are relevant to the question of "being a stochastic parrot". It astounds me that people routinely suggest that human brains and LLMs are somehow indistinguishable.
But that IS the definition of consciousness! This is like saying "We understand practically everything about airplanes, except how they stay in the air."
Nothing discussed in neuroscience is relevant to understanding what consciousness IS (which is the question posed above). Finding out that stimulating such and such a region makes us sad, or that this bundle of nerves activates before we're consciously aware of a decision doesn't tell us anything about consciousness itself. We've known for hundreds of years that there is a relationship between the brain and consciousness, finding out more details doesn't answer the question.
(Now, whether consciousness is necessary for AGI is a separate question.)
I'm not "getting off" saying that, but I do say it often.
For me, it's important to know:
If we think an artificial neural network can have consciousness and we're wrong, then there is a risk of all the people who want to have their minds uploaded having a continued existence no better than the one of TV stars reproduced on VHS tape. There is also a risk of this being done as a standard treatment for lesser injuries, especially if it's cheaper.
If we think an artificial neural network can't have consciousness and we're wrong, then there is a risk of creating a new slave class that makes real the fears of the Haitian slaves in the form of the Vodou concept of a zombie: not even death will free them from eternal slavery.
But that "sliver" of a problem is known as the hard problem of consciousness for a reason [1], which is exactly the sort of problem neuroscience can only address in a limited capacity. Understanding how nerves propagate a signal to produce a sensory input (an "easy" problem of consciousness) doesn't inform us as to why certain physical mechanisms result in conscious experience (or more fundamentally what it even means to have a conscious experience).
To return to the topic at hand, a stochastic parrot generates grammatical, sensible language without understanding its underlying meaning. Of course, you can debate what it means to understand something; but for a person to vocalize an idea they understand, they must first somehow consciously process that idea. This is firmly a hard problem to which neuroscience offers limited guidance.
Of course, I'd agree that human beings aren't stochastic parrots -- if human beings were stochastic parrots, then what would it even mean to understand something? But I doubt you could use neuroscience to ascertain whether large language models are or aren't stochastic parrots. Indeed, depending on your definition of "understanding", consciousness might not even be a prerequisite, making the comparison to neuroscience moot.
---
[1] https://en.wikipedia.org/wiki/Hard_problem_of_consciousness
There is no evidence that processing power = mind. None. There is no evidence that the human condition is any way related to some kind of terra firma of logic. In fact, there's considerable evidence that feelings are so entangled in the experience of humanness that the idea of divorce or separation is a false one. "Being human" is primarily a feeling experience that drives narratives, motivations: it underlies every single activity we engage in.
This is why people like Eliezer Yudkowsky and his ilk are so totally off the mark: it's no coincidence that the Less Wrong community and AI doomsayers can often be found on the same side of the aisle. Both camps believe in and idealize a distinct logical mind that can be attained. Funnily enough, it's still fear, a very human feeling, that is the basis for all these proclamations.
My worry is this camp garners enough influence to convince someone an AI doomsday is right around the corner unless immediate action is taken.
By analogy, I've been married for just shy of 20 years. I know my wife very well. I certainly do not know everything there is to know about her, but I do know her.
So, with AI, is it fair to say that anyone really cares whether it will develop qualities that make it seem as though it is an emergent consciousness? Why would we treat digital consciousness any better than we treat organic consciousness? What is the point of pontificating whether or not the type of thinking an AI does crosses an arbitrary threshold when that threshold only exists as a tool for creating useful outgroups?
However sophisticated the thing that our thinking is, it exists on a scale and we sit at an arbitrary spot. We treat thinking that occurs further down the scale as functionally irrelevant not because of any real distinction but because doing so has a high utility for our species.
So, the question of how we will treat a "truly conscious and sentient" AI has already been answered. Look at how we treat pigs. Good luck out there, HAL.
so i ask you a followup question: what are some easy to understand ways in which a human's thought process would differ from an llm's behavior?
also for anyone else wondering: https://www.merriam-webster.com/dictionary/qualia
> an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.
Next token prediction is a function of tokens/words, but that doesn't preclude that prediction depending on meaning, and the best predictions obviously do depend on meaning. It is not clear, at least to me, that next token prediction leads to any kind of upper bound on intelligence. It is always possible to incorporate more of the descriptions of the world obtained through the training data into your predictions to improve them.
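For anyone who wants something concrete to argue over: below is a toy sketch of the most literal reading of the quoted definition, a bigram sampler that stitches tokens together purely from counts of what followed what. It is an illustration only and not how a transformer LLM works; the names `train_bigram` and `generate` and the tiny corpus are made up for this example.

```python
import random
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length, rng):
    """Sample a continuation one token at a time from the bigram counts."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # no observed continuation: stop early
            break
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights, k=1)[0])
    return out

# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = "the apple is on the book and the book is on the table".split()
model = train_bigram(corpus)
sample = generate(model, "the", 8, random.Random(0))
print(" ".join(sample))
```

Every adjacent pair in the output was seen in the corpus, and nothing else informs the choice. Whether a model that conditions on far more context than one previous token is still "just" doing this is exactly what the thread is disagreeing about.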
But I think you've missed an important distinction. The stochastic parrot claim can be false, not because LLMs can or will ever feel or be conscious, but because they can (today) reason and solve novel problems (the capability is there, but it is unreliable). LLMs are not probabilistically regurgitating their training sets; they're applying the learning they took away from those training sets.
I think GPT-4 can reason today, but I don't think it can feel or is conscious, and I don't expect it to be capable of those things in its current architecture.
Consciousness is orthogonal to the discussion (the term doesn't appear in the linked article).
Emergent behavior... emerges. It's hard to predict or explain from constituents. Scale changes everything.
If that happens, then stochastic parrot as an argument as to why a machine isn't thinking can be made pretty useless if one chooses to drag the argument further into philosophy.
But we're already getting past this with multi-modal models! Some really great work is being done which ties language processing with visual perception and in some cases robot action planning. A model can know how we talk about apples, can see where an apple is in a scene, can navigate to and retrieve an apple, etc. This lets us get at truth ("Is the claim 'the apple is on the book' true of this scene?") in a way which text-only models fundamentally cannot have. The point is, the way you get past the "stochastic parrot" phase requires qualitative structural changes to incorporate different kinds of information -- not just scaling up text-only models.
> They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't.
I can't prove you're not a stochastic parrot by only talking to you via text. But in person I can toss you an object and you can catch it which shows that you understand how to interact with a dynamic 3D environment. I can ask you a question about something in our shared environment, and you can give an answer which is _true_, rather than which is a plausible-sounding sentence. This is the difference between knowing what English texts or English conversations look like, versus knowing what states of the world are referred to by statements.
I'm not saying that the current LLMs have derived human-level world models (they haven't). It's just that, to me, the theory that textual data is categorically not enough to do so is necessarily empirical. To back up the assertion, you'd need to construct metrics which present text-only LLMs fail to succeed with, and then you need to show how multi-modal LLMs did succeed with those same metrics. So far, I don't think adding multi-modality to LLMs actually has improved their general-purpose reasoning ability, which I consider evidence against this theory. But then I read people online just asserting it as though it's an obvious truth derivable from philosophical first-principles. It's odd to me.
Right. People think the stochastic parrot description is about the Chinese Room thought experiment, but it's not. It's about the Thai Library thought experiment: https://medium.com/@emilymenonbender/thought-experiment-in-t...
As far as your statement regarding consciousness goes, it's glib to say that no one has any reference point for what consciousness, thinking, or feeling are. We all have our own lived experience to draw on for intuition and guidance to inform our thinking, which is invaluable. We can relate our qualitative perception of these phenomena to other things in the world, where a reasonable person can form the hypothesis that "matrix multiplication" is unlikely to be conscious, to think, or to feel by dint of it being an abstract mathematical concept, since there is no precedent for an abstract mathematical concept exhibiting any of these qualities. Indeed, the only things in our lived experience which can plausibly be said to be conscious, to think, or to feel are biological organisms, which a computer is not.
Tell me what your view is on the ability of LLMs to become AGI, and I'll tell you whether you believe in an immortal soul.
I believe that machines could be imbued with consciousness and I do not rule out that there could be supernatural elements to consciousness.
Or at least things that fall outside the realm of strictly testable science.
https://www.amazon.com/Emperors-New-Mind-Concerning-Computer...
It's pretty clear the whole point is to minimize the difference between us and AI, but it does feel like you are undermining your argument by trying to work it from both sides. It reminds me of someone accused of a crime who says both "I didn't do it!" and "If I did it, it wasn't wrong!".
Humans aren't stochastic parrots. You can't "prove" this because it's not a mathematical fact, but there is plenty of evidence from the study of how the brain works to show this. Hell, it's even readily apparent from introspection if you'd bother to check. LLMs on the other hand basically are stochastic parrots because they are just autoregressive token predictors. They might become less so due to architectural changes made by the companies working on them, but it isn't going to just creep up on us like some goddamn emergence boogeyman.
On that, there is a great "In Our Time" episode: https://open.spotify.com/episode/5oln4RwbhsKwjlZuxPuYYB?si=5...
Unless it is able to feel pain it remains a stochastic parrot and I wouldn't call it conscious or alive in any philosophical sense nor can one say it is capable of "feeling".
Under the view that we are all just complexity arising out of an unfathomably large universe, then we can accept that LLMs are just that, like us, but weaker, and that is fine.
They will improve, we can leverage them, we can live with them. It's almost as if we have created a new species that exists only abstractly; and arises out of silicon and electrons.
I'm very surprised by this, because in essence, it's a flat-out denial of the emergence concept, no different from denying that atoms can ultimately lead to biological entities.
There is also a problem to me that "stochastic parrot" is too clever. It is too good of a name and evokes such a strong mental image. It is a great name for branding purposes but because of that it is a terrible name if we are actually trying to discover truth. It can't but help to become a blunt, unthinking, intellectual weapon and rhetorical device.
But you almost never see the logic applied the other way around: maybe we are just a bunch of simple mechanisms convinced we are something way more complex.
There are quite a few hints that the second option is the actual reality.
> Optimist: AI has achieved human-level performance!
> Realist: “AI” is a collection of brittle hacks that, under very specific circumstances, mimic the surface appearance of intelligence.
> Pessimist: AI has achieved human-level performance.
This might be the first time the term was seen in an ’official’ context, but is it really the origin? It feels like the term has been hovering around for longer, and even Google Trends shows significant search trends way before 2021
For instance, there's this ecology paper from 2014: Influence of stochastic processes and catastrophic events on the reproductive dynamics of the endangered Maroon‐fronted Parrot Rhynchopsitta terrisi
Not sure what happens under the hood, but it wouldn't surprise me if people searching for this paper would show up under "stochastic parrot" in google trends even if that's not what they literally searched for.
I was initially thinking "well, yes, Nobel Prize for Stating the Obvious there", but looks like the paper was written in the far distant past of 2021, when LLMs were largely still in their babbling obvious nonsense stage, rather than the current state of the art, where they babble dangerously convincing nonsense, so, well, fair enough I suppose.
Amazing how fast progress has been there, though it's progress in an arguably rather worrying direction, of course.
At that point, OpenAI was still fairly clearly at the babbling obvious nonsense phase; I would wonder was Google's stuff much better.
I also wonder if the original authors would have been surprised to learn that, by 2023, lawyers would be citing fake precedent made up by a machine. The progression to "dangerous nonsense" really does seem to have been worryingly fast.
The term in general seems to be unfortunate because the models seem to do more than parroting. LLMs are more like central pattern generators of the nervous systems, able to flexibly create well coordinated patterns when guided appropriately
Actually transformers do not require randomness at all, so not at all
Training alone relies hugely on many factors (e.g. initialization of parameters, order of training data, hyperparameters, etc.).
In evaluation (afaik this applies to recent models as well) you pick the continuation based on chance and not always the "best". But evaluation is the result of the training process, so all the randomness from that factors in as well.
*substantial as in nontrivial, not substantial as in massive
At what point does a stochastic parrot fake it till it makes it? Does it even matter? We can imagine that, within 10 years, we'll have a fully synthetic virtual human simulator: a generative AI combined with a knowledge base, language parsing, audio and video recognition, basically a talking head that could join your next technical meeting and look like a full contributor. If that happens, will the Timnits and the Benders of the world admit that, perhaps, systems which are indistinguishable from a human may not just be parrots, or perhaps, we are just sufficiently advanced parrots?
Seen from that perspective, the promoters of stochastic parrots would seem to be luddites and close-minded, as well as discouraging legitimate, important, and valuable scientific research.
The organizations that listened to these people for even some amount of time got hosed in this situation. Google managed to oust this flock from within but not before their AIs were so lobotomized that they are widely renowned for being the village idiot.
Ultimately, this paper is a triumph of branding over science. Read it if you'd like. But if you let these kinds of people into your organization, they'll cripple it. It costs a lot to get them out. Instead, simply never let them in.
This is the same as curation and picking out the dataset, except as post-processing. The reason why RLHF has to happen (and traumatize the people <https://www.bigtechnology.com/p/he-helped-train-chatgpt-it-t...>) is to address the problems by censoring the model.
If you read a book that you disagree with, or one that contains falsehoods and bad reasoning as far as you can tell, would that make you believe those things?
Everything we revile about online recipe websites that spend 1000 words about the history of cooking before getting to the point, will be part and parcel of AI-written anything. It won't be properly proofread or edited by a human, because that would defeat the purpose.
https://arxiv.org/abs/2306.03341
> Our findings suggest that LLMs may have an internal representation of the likelihood of something being true, even as they produce falsehoods on the surface.
The problem here is that there is currently no reliable way to extract information from this hypothetical world model. Language models do not always say what they "believe", they might instead say what is politically correct, what sounds good etc. Researchers try to optimize (fine-tune) language models to be helpful, honest, and harmless, but honesty ("truthfulness") can't be easily optimized for.
What these LLMs and diffusion models and such actually are is a lossy compression method that permits structural queries. The fact that they can learn structure as well as content allows them to reason as well, but only to the extent that the rules they’re following existed somewhere in the training data and its structure.
If one were given access to senses and memory and feedback mechanisms and learned language that way, it might be considered actually intelligent or even sentient if it exhibited autonomy and value judgments.
This can lead to very detailed articles written by very enthusiastic people. In other cases the people who are very pro/against the subject will be the ones who put in the most effort, especially on smaller/controversial subjects.
I have seen Wikipedia pages which basically read like ads for small companies.
"Meaning without reference in large language models"
"we argue that LLM likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role"
https://arxiv.org/pdf/2208.02957.pdf
I remember Quine's meaning holism; it seems to be related.
because such accounts are both accurate, and deeply misleading.
This is description, but it is neither predictive, nor explanatory.
It implies a false model, rather than providing one.
Evergreen:
Ximm's Law: every critique of AI assumes to some degree that contemporary implementations will not, or cannot, be improved upon. Lemma: any statement about AI which uses the word "never" to preclude some feature from future realization is false.
It seems to me that the great success transformers are now enjoying is precisely due to the fact that 'probabilistic information about how they combine' _is_ meaning.
It's obvious nonsense. I can describe a new concept to you using only words and letters and you can understand it. Therefore you can build up knowledge using only syntax.
Nobody is saying that LLMs understand the layout of a bus or the feel of leather, but they understand that buses are vehicles with four wheels that transport people etc.
Face-slappingly poor philosophy.
Human perception can be ambiguous, but minimal changes never cause drastic category errors.
I do not think that this would really change much in itself. If you tell the model that crimson is a shade of green, it will learn something wrong whether it has a body or not. What you need is feedback on whether a response is correct or not, factually correct, not grammatically correct. Alternatively you have to teach the model to perform its own fact checking and apply it to its responses.
And so if the topic of grass comes up, I have some firsthand knowledge to draw on - less than a botanist, but not nothing. I have some sense impressions that correlate to other sense impressions and to the word "grass". GPT, on the other hand, has some words that correlate to other words, and nothing more.
So it seems fair to say that I understand grass on a level that GPT does not, and cannot. Therefore it seems fair to say that GPT is at least closer to being a stochastic parrot than humans are.
Obviously we don't know for certain if other humans are sentient, but it seems necessary to establish the premise they are in order to get anywhere in the argument for sentience of AIs. In this case, we need an argument about the sentience of AIs that coincides with our experiences of the sentience of humans, which this argument doesn't seem to do.
Even if we limit ourselves to thinking about people with all of their senses, there's still information that we cannot tie back to the physical world with our senses. Take someone who sits at a computer all day. They read news and talk about it online, without ever interacting with the news physically. Take someone who theoretically has never done anything outside of read and type on a computer all day. Are they not sentient because they've never physically interacted with the world outside of their computer?
They still interact with an external world. An LLM doesn't, at all, not even a little bit. That's the crucial difference. A person will know when things didn't go as predicted, as the real world will provide feedback they can sense. An LLM in contrast has no idea what is going on, its past actions don't exist for it. There is only the prompt and the unchanging base model.
That said, this is not to disparage the abilities of LLMs, they simply were never designed to be sentient. If one wants an LLM that is sentient, one has to build some feedback into the system that allows it to change and evolve depending on its past actions.
Who wants this from ML systems? I want them to be useful, not to have autonomy and value judgments.
BingChat sort of tries that, but it doesn't really have any autonomy either, so it just summarizes the first Bing search result it gets. It would be far more useful if it could search around two or three layers depth into the search results to actually find what you are looking for.
In general current AI systems have the problem that you have to babysit them far too much. If you want to get specific answers, it's you that has to provide all the necessary context to make it happen; the AI can't figure out by itself what you want from past conversations.
In a related way, because we learn on-line and constantly, our brains have to also maintain goals, rewards and punishments, etc etc. We have neurons for all of the trivia of keeping us moving, seeking new input, generalizing it, throwing away bad information, etc. For an LLM all of that is external. The LLM doesn't have any reason to even distinguish between generation and training. All the weight updates are calculated by a (relatively simple) external process. Furthermore, LLMs are entirely _feed forward_. The input comes in, a lot of numbers are crunched, and then output comes out. There is no rumination (again, the analogy for rumination in an LLM is in the training process, which is not embodied in the LLM).
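A minimal sketch of that split, assuming a single linear unit as a stand-in (nothing here resembles a real LLM, and the function names are hypothetical): generation is a pure function of frozen weights, while the weight update lives in a separate, external step.

```python
def forward(weights, inputs):
    """Generation: a pure function of frozen weights and the input."""
    return sum(w * x for w, x in zip(weights, inputs))

def training_step(weights, inputs, target, lr=0.1):
    """Training: an external process computes the update and returns new weights."""
    error = forward(weights, inputs) - target
    # Gradient of 0.5 * error**2 with respect to each weight is error * x.
    return [w - lr * error * x for w, x in zip(weights, inputs)]

weights = [0.5, -0.2]   # illustrative values only
x = [1.0, 2.0]

before = list(weights)
y1 = forward(weights, x)
y2 = forward(weights, x)          # same input, same weights, same output

new_weights = training_step(weights, x, target=1.0)
print(weights == before, y1 == y2)
print(abs(forward(new_weights, x) - 1.0) < abs(y1 - 1.0))
```

Calling `forward` any number of times leaves `weights` untouched; only `training_step`, which sits outside the model, produces something that behaves differently next time.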
Much of the content of our consciousness is perceptions relating to all of these things. I think it's possible that artificial neural networks may one day do enough of these things that I would admit they are conscious, but architecturally and fundamentally, I don't see any reason that an LLM would have them.
I also don't think even GPT4 is that intelligent (fantastic recall, though). It does an impression of a cognitive process (literally by printing out steps) but that doesn't seem compelling enough for me to imagine a theory of mind underneath. A model of text, sure, but not a mind.
I really enjoyed this response and I have learned from it. (Which I guess an LLM could not do while generating something!)
Some references I glanced at (I mostly read the top paras):
- https://en.wikipedia.org/wiki/In_situ
- https://en.wikipedia.org/wiki/Theory_of_mind
Really enjoyed this response, and feel like I've developed a better understanding of some of the concepts relating to generative ML as it is used in LLMs.
An aside: I took a course on ML in a university a few years back, and it was interesting (it was an intro and survey course offered by the CompSci faculty), but difficult for me. I excelled at implementing using Keras/TF code in Python, and I had fun manually implementing some gradient descent algorithm but a lot of the math including all of the multi-var calc, stats, probability was quite difficult for me to wrap my head around, and I really didn't feel like I got a solid grounding on a meta-level of what we were doing or why. I have been reading a bit about LLMs and I think your post has filled in some of the gaps in what at this point I was really looking to understand.
I think we are focusing on the model too much and miss the real hero - language. The corpus of text these models are trained on is a marvel of human creativity. This cultural artefact is the diff between primitive and modern humans. And it is the diff between a random initialisation and a trained GPT-4. Maybe the brain or the model don't matter, but what you train them on.
Even more, language is special. Ideas are self replicators, they have a lifecycle, they have evolutionary pressure to improve. Ideas travel a lot. No single human can recreate this knowledge, it is the result of massive search. I'd say more than 99% of human intelligence is based on applying ideas invented by someone else. So let's be more lenient on the parroting accusations. AIs can be smart if they get feedback, like AlphaZero, but without feedback they of course have to parrot.
This isn't that dissimilar to working at any sufficiently advanced R&D outfit, which strongly demonstrates the principle "the future is already here but isn't evenly distributed".
What I am saying is that we emphatically know things about the physical processes that (almost certainly) generate consciousness and that we should take that knowledge seriously when examining artificial neural networks. People eager to attribute more to these networks than they plausibly constitute love to dismiss all this knowledge so as to muddy the waters of comparison.
I'm prepared to believe that people who aren't me know such things, but last time I asked a PhD in brain research about this (a while ago now), they seemed to disagree.
At least, assuming we're talking about the same usage of the word "consciousness" here — when it's defined as "opposite of unconscious" then sure we have drugs to turn that off, and also separately with the non-overlapping definition of "opposite of autonomous or reflexive"…
…but the weird thing where I have an experience rather than just producing responses to stimuli? If anyone knows about that, my search engine bubble hides it from me.
> What insects can tell us about the origins of consciousness
https://www.pnas.org/doi/10.1073/pnas.1520084113
Maybe consciousness is the self&world-simulator.
In fact in my 30 year practice at one point I was scared to bring the practice into my daily lived life, fearing that being uncompelled by these processes and having a clear mind would make me a robot or something - but the opposite was true. At some core level I knew my experiences and connections deeper than a feeling, and the people around me felt I was finally with them for the first time.
My point here is that the western conception of what it means to be a human is not particularly simple and it’s not the case, assuming thousands of years of Buddhist practice isn’t a crock, that our feelings and thoughts are the core of what it is to be human. Further - if they are illusions and feedback systems, they can be simulated as constraining feedback systems in an artificial mind just as easily.
I think the nature of what is human is much deeper in our minds, but because it’s not easy to examine like feelings and thoughts, I think we really do not understand it very well. This leads me to my long labored point - I agree with the original poster that we don’t understand consciousness. I believe we overestimate our understanding of what it means to be human. I do not however think our machines will achieve it either. But I don’t know why we need to make an artificial human. AI means intelligence, not human. A natural human takes 9 months and we have too many of them, let’s try for something different.
To put it differently: You can make them deterministic by using a temperature of zero (then the output would be pretty bad and repetitive), or having a "better" temperature and fixing a random seed (then the output would be better, but it would only be deterministic in the same sense as a simulation of Brownian motion with fixed random seed).
https://ai.stackexchange.com/questions/32477/what-is-the-tem...
Section 3.3 in https://www.lesswrong.com/posts/pHPmMGEMYefk9jLeh/llm-basics...
https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...
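To make the temperature and seed point concrete, here is a small sketch under stated assumptions: the three logits are invented, `pick_token` and `sample_run` are hypothetical helpers, and treating temperature 0 as a plain argmax is a common convention assumed here rather than something taken from the links above.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(logits, temperature, rng):
    """Greedy argmax at temperature 0, otherwise sample from the softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def sample_run(logits, temperature, seed, n):
    """Draw n token indices using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [pick_token(logits, temperature, rng) for _ in range(n)]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

greedy = sample_run(logits, 0, seed=1, n=5)
run_a = sample_run(logits, 1.0, seed=42, n=5)
run_b = sample_run(logits, 1.0, seed=42, n=5)
print(greedy)          # [0, 0, 0, 0, 0]
print(run_a == run_b)  # True
```

The greedy run always picks the top-scoring token, and the two seeded runs match each other exactly, which is the "deterministic like a seeded Brownian motion simulation" sense described above.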
I would guess the random step is not even mandatory: there is probably a way to replace randomness with a simplified function and still get interesting text. I can't run a simulation but there is no indication here that good randomness is needed.
Fundamentally the design of the transformer and especially its core which is attention based, does not require randomness, so to call it a stochastic model is a stretch
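As a rough illustration of that claim, here is scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, written in plain Python. The tiny Q, K and V matrices are invented for the example, and this leaves aside training-time dropout, which is a separate source of randomness.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical 2x2 inputs, chosen only to keep the arithmetic readable.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

first = attention(Q, K, V)
second = attention(Q, K, V)
print(first == second)  # True
```

No random number generator appears anywhere in the computation. In this picture the randomness enters only afterwards, when a token is sampled from the output distribution.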
GPT isn't quite at the good-enough point, and being limited to only text makes it impossible to reason about aspects of the world that are difficult to describe in text or simply weren't in the training data.
And more generally speaking, the claim that LLMs don't understand anything really doesn't hold up given how much they are able to hallucinate. If an LLM truly didn't understand anything, it wouldn't be able to generate plausible text; it would either generate nonsense or be limited to whatever was in the training data, but that's not the case. The LLMs can predict past their trained knowledge and predict stuff they haven't seen yet. Those predictions will sometimes turn out wrong, but so will a human's prediction that the AstroTurf is grass when taking a closer look.
Humans are also like that
Not sure if personal experiences count. Generally, we laugh at people who talk about esoteric experiences.
So a simple explanation could be that consciousness is an illusion?
Or put differently, is there any phenomenon that needs the assumption of consciousness?
The way I experience myself could be just the history of experiences. So there is something that the brain can refer to.
Doesn't having an illusion presuppose consciousness?
Bringing all perceptions together into the simulation, integrating them into the same reference system, and using them to imagine, plan, act and learn - that could be consciousness.
What is the price of that kind of knowledge? You don't even know where the border is between your knowledge and your absence of knowledge. How much can you tell about consciousness without stepping onto the "absence of knowledge" field? Pretty much nothing, isn't it?
> By analogy, I've been married for just shy of 20 years. I know my wife very well. I certainly do not know everything there is to know about her, but I do know her.
I have a better example. I have been speaking English for 20 years, counting from my first English lesson, when I learned my first English word. You can find plenty of silly mistakes in my comments. But at least I know what I can express or understand and what I cannot.
OK, then what is it?
Even that is a sufficient level of understanding to correctly determine that a motorcycle is not an airplane.
While we might not have a complete picture of what consciousness entails, we can at least list some necessary conditions for it to arise. Any system that lacks those conditions can at least be proven to not be conscious.
With LLMs specifically, I think there is a very strong argument that they are not and cannot be conscious at all, regardless of how big a corpus you throw at them or how many parameters they have. Emily Bender explains it well here:
https://medium.com/@emilymenonbender/thought-experiment-in-t...
But are we talking about airplanes, or are we talking about "flying"? Airplanes fly, motorcycles don't. Do hot air balloons fly? Or is floating not flying?
Neuroscience tells us about human and similar consciousness. Maybe all biological consciousness, but maybe not even that. Are we sure we're not exploring a subset of consciousness though, and other variations exist that we're unaware of and will catch us off guard because we haven't encountered them (or recognized when we have)?
I think that's the important question here, and it goes beyond LLMs, because whether they can or cannot achieve consciousness doesn't mean something else will follow the same path.
I don't know of any general principle one could use to determine if system X has or doesn't have property Y if you don't at least have some definition of Y.
Do microwaves flern? Can fish frabulate?
But we know what nuclear physics is. We don't know what consciousness is. I'm not asking how consciousness works, I'm asking what consciousness is.
Or maybe the abstract, high-level understanding of the brain provided by psychology is enough to explain its dynamic behavior? Maybe I can become an expert in IC design by learning React?
We know how neural nets work on a fundamental level and we know how they work on multiple levels of higher abstraction, yet explainability is one of the biggest problems in machine learning right now. These models can solve complex problems which computer scientists long struggled to develop algorithms for, even though every aspect of them, except emergent behaviors due to complex interactions, is known by us.
The issue is that consciousness is a strongly emergent property, touching every level of abstraction and comprising patterns from the specific to the general. Knowledge of how a system works on the ground level, or how it works on some coarse levels of abstraction, does not allow you to classify it as conscious or unconscious.
Additionally, consciousness is ill-defined. There is no agreed-upon definition that is free of contradictions, does not accidentally include systems that we would not see as conscious, and does not accidentally exclude a significant portion of humanity.
I invite you to think up some properties of the human brain that you would classify as essential for consciousness to emerge, and then try to think up exceptions. I'm very confident that you can come up with at least one for every single property.
To put it another way, either you also have consciousness, in which case I have explained it, or else I can't explain it to you because you don't.