Stochastic Parrot

125 points by miobrien 3 years ago | 161 comments

I worry that the "stochastic parrot" was premature, an idea sown early in development that will now carry along through any advances made.

Basically there is this innate idea that if the basic building blocks are simple systems with deterministic behavior, then the greater system can never be more than that. I've seen this in spades within the AI community, "It's just matrix multiplication! It's not capable of thinking or feeling!"

Which to me always felt more like a hopeful statement rather than a factual one. These guys have no idea what consciousness is (nobody does) nor have any reference point for what exactly is "thinking" or "feeling". They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't.

So while yes, present LLMs likely are just stochastic parrots, the same technology scaled might bring us a model that actually is "something that is something to be like", and we'll have everyone treating it with reckless carelessness because "it's just a stochastic parrot".

nathan_compton 3 years ago | |

"These guys have no idea what consciousness is (nobody does)"

Where do people get off saying no one has any idea what consciousness is? I agree that there is a significant sliver of a philosophical problem which remains stubborn (how precisely does physical activity produce qualia), but neuroscience knows quite a bit about what physical processes underlie our behavior from the behavior of individual neurons to the activity of the entire brain.

I object to the wholesale dismissal of neuroscience because thinking about the brain relative to LLMs is genuinely informative about what sorts of things you could expect to be going on in an LLM. And, to my mind, a real appraisal of the differences between brains and LLMs makes the case pretty strongly that LLMs experience nothing and are, furthermore, fairly well characterized as stochastic parrots.

"They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't." Prove is a very strong word, but I think it's actually quite possible to demonstrate via scientific observation that you differ from LLMs in many significant ways that are relevant to the question of "being a stochastic parrot". It astounds me that people routinely suggest that human brains and LLMs are somehow indistinguishable.

SamBam 3 years ago | | |

> I agree that there is a significant sliver of a philosophical problem which remains stubborn (how precisely does physical activity produce qualia)

But that IS the definition of consciousness! This is like saying "We understand practically everything about airplanes, except how they stay in the air."

Nothing discussed in neuroscience is relevant to understanding what consciousness IS (which is the question posed above). Finding out that stimulating such and such a region makes us sad, or that this bundle of nerves activates before we're consciously aware of a decision doesn't tell us anything about consciousness itself. We've known for hundreds of years that there is a relationship between the brain and consciousness, finding out more details doesn't answer the question.

(Now, whether consciousness is necessary for AGI is a separate question.)

ben_w 3 years ago | | |

> Where do people get off saying no one has any idea what consciousness is?

I'm not "getting off" saying that, but I do say it often.

For me, it's important to know:

If we think an artificial neural network can have consciousness and we're wrong, then there is a risk of all the people who want to have their minds uploaded having a continued existence no better than the one of TV stars reproduced on VHS tape. There is also a risk of this being done as a standard treatment for lesser injuries, especially if it's cheaper.

If we think an artificial neural network can't have consciousness and we're wrong, then there is a risk of creating a new slave class that makes real the fears of the Haitian slaves in the form of the Vodou concept of a zombie: not even death will free them from eternal slavery.

panda-giddiness 3 years ago | | |

> I agree that there is a significant sliver of a philosophical problem which remains stubborn (how precisely does physical activity produce qualia)

But that "sliver" of a problem is known as the hard problem of consciousness for a reason [1], which is exactly the sort of problem neuroscience can only address in a limited capacity. Understanding how nerves propagate a signal to produce a sensory input (an "easy" problem of consciousness) doesn't inform us as to why certain physical mechanisms result in conscious experience (or more fundamentally what it even means to have a conscious experience).

To return to the topic at hand, a stochastic parrot generates grammatical, sensible language without understanding its underlying meaning. Of course, you can debate what it means to understand something; but for a person to vocalize an idea they understand, they must first somehow consciously process that idea. This is firmly a hard problem to which neuroscience offers limited guidance.

Of course, I'd agree that human beings aren't stochastic parrots -- if human beings were stochastic parrots, then what would it even mean to understand something? But I doubt you could use neuroscience to ascertain whether large language models are or aren't stochastic parrots. Indeed, depending on your definition of "understanding", consciousness might not even be a prerequisite, making the comparison to neuroscience moot.

---

[1] https://en.wikipedia.org/wiki/Hard_problem_of_consciousness

emptysongglass 3 years ago | | |

Unfortunately, this is the thread from which all these arguments begin: demonstrable ignorance of either the science, the research, or both. They originate in ignorance and, because of ignorance, the response is terrible fear and prophecies of the End Times. It's no different than any other brand of fear.

There is no evidence that processing power = mind. None. There is no evidence that the human condition is in any way related to some kind of terra firma of logic. In fact, there's considerable evidence that feelings are so entangled in the experience of humanness that the idea of divorce or separation is a false one. "Being human" is primarily a feeling experience that drives narratives, motivations: it underlies every single activity we engage in.

This is why people like Eliezer Yudkowsky and his ilk are so totally off the mark: it's no coincidence that the Less Wrong community and AI doomsayers can often be found on the same side of the aisle. Both camps believe in and idealize a distinct logic mind that can be attained. Funnily enough, it's still fear, a very human feeling, that is the basis for all these proclamations.

My worry is this camp garners enough influence to convince someone an AI doomsday is right around the corner unless immediate action is taken.

ARandomerDude 3 years ago | | |

Likewise, the nobody knows what consciousness is mindset is smuggling in the idea that in order to know something at all, you must know it comprehensively. I know exactly what consciousness is based on my experience with it, even though I could not possibly give a comprehensive account of everything consciousness entails.

By analogy, I've been married for just shy of 20 years. I know my wife very well. I certainly do not know everything there is to know about her, but I do know her.

fnovd 3 years ago | | |

Here's a thought experiment: what are the qualities of a parrot that would make its consciousness different than that of a stochastic parrot? For that matter, what are the qualities that separate a parrot's consciousness from a human's? Or a human's from a pig's? Given the way we treat pigs, we clearly don't think consciousness in and of itself is worthy of any formal consideration, as long as the benefit we derive from exploiting it is high enough.

So, with AI, is it fair to say that anyone really cares whether it will develop qualities that make it seem as though it is an emergent consciousness? Why would we treat digital consciousness any better than we treat organic consciousness? What is the point of pontificating whether or not the type of thinking an AI does crosses an arbitrary threshold when that threshold only exists as a tool for creating useful outgroups?

However sophisticated the thing that our thinking is, it exists on a scale and we sit at an arbitrary spot. We treat thinking that occurs further down the scale as functionally irrelevant not because of any real distinction but because doing so has a high utility for our species.

So, the question of how we will treat a "truly conscious and sentient" AI has already been answered. Look at how we treat pigs. Good luck out there, HAL.

avg_dev 3 years ago | | |

thank you for this. i am not well up on consciousness (or machine learning) and i have seen chatbots/llms hallucinate and such, and i have also seen them do amazing things. i have wondered to myself a few times lately: how do i know that im dissimilar in nature from these things?

so i ask you a followup question: what are some easy to understand ways in which a human's thought process would differ from an llm's behavior?

also for anyone else wondering: https://www.merriam-webster.com/dictionary/qualia

vernon99 3 years ago | | |

This is a very interesting point that requires some examples and further elaboration to have value for the readers. It refutes but doesn't provide arguments. Can you please elaborate?

loandbehold 3 years ago | | |

Is there ANY evidence that "qualia" is a real thing? That sounds like vitalism that was debunked a long time ago.

cjbprime 3 years ago | |

I think that some LLMs (mainly just GPT-4) should be considered as refutations to the Stochastic Parrot idea, which was published in March 2021 and claims no LLM can have "any model of the world". It was a reasonable (though perhaps overconfident) paper for authors who had only used GPT-3 to publish, but there is now ample evidence of world modeling, including published academic evidence, for GPT-4. I think the following claim from the paper is also deeply incorrect and confused:

> an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.

Next token prediction is a function of tokens/words, but that doesn't preclude that prediction depending on meaning, and the best predictions obviously do depend on meaning. It is not clear, at least to me, that next token prediction leads to any kind of upper bound on intelligence. It is always possible to incorporate more of the descriptions of the world obtained through the training data into your predictions to improve them.

But I think you've missed an important distinction. The stochastic parrot claim can be false, not because LLMs can or will ever feel or be conscious, but because they can (today) reason and solve novel problems (the capability is there, but it is unreliable). LLMs are not probabilistically regurgitating their training sets; they're applying the learning they took away from those training sets.

I think GPT-4 can reason today, but I don't think it can feel or is conscious, and I don't expect it to be capable of those things in its current architecture.

FabHK 3 years ago | |

The notion of the stochastic parrot is that the system produces plausible sounding words, but does not understand, does not exhibit intelligence.

Consciousness is orthogonal to the discussion (the term doesn't appear in the linked article).

cjalmeida 3 years ago | |

Indeed. It's like saying life is just long strands of DNA composed of only four simple molecules. Or economy is just people trading their surpluses.

Emergent behavior... emerges. It's hard to predict or explain from constituents. Scale changes everything.

ac2u 3 years ago | |

Agreed. I think the stochastic parrot concept is useful to ground our expectations on LLMs for now, but it could outlive its usefulness if there end up being multiple jumps in sophistication similar to that of GPT-2 to 4 in the next 10 years.

If that happens, then stochastic parrot as an argument for why a machine isn't thinking can be rendered pretty useless if one chooses to drag the argument further into philosophy.

abeppu 3 years ago | |

I disagree. The criticism is _not_ that basic building blocks cannot be combined to produce something richer. The issue is the "without any reference to meaning" part of the quoted definition from Bender in that article. Models which are _only_ trained on text do not have a grounding to relate linguistic forms to anything else. When you know what an apple is, it's in part because you've seen and touched and tasted and eaten one. The model only knows how people talk about apples, and which texts are plausible, but not which ones are true.

But we're already getting past this with multi-modal models! Some really great work is being done which ties language processing with visual perception and in some cases robot action planning. A model can know how we talk about apples, can see where an apple is in a scene, can navigate to and retrieve an apple, etc. This lets us get at truth ("Is the claim 'the apple is on the book' true of this scene?") in a way which text-only models fundamentally cannot have. The point is, the way you get past the "stochastic parrot" phase requires qualitative structural changes to incorporate different kinds of information -- not just scaling up text-only models.

> They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't.

I can't prove you're not a stochastic parrot by only talking to you via text. But in person I can toss you an object and you can catch it which shows that you understand how to interact with a dynamic 3D environment. I can ask you a question about something in our shared environment, and you can give an answer which is _true_, rather than which is a plausible-sounding sentence. This is the difference between knowing what English texts or English conversations look like, versus knowing what states of the world are referred to by statements.

ambrozk 3 years ago | | |

By your definition, is a blind person capable of reasoning about visual data? Is a deaf person capable of reasoning about auditory data? Can a physicist understand the molecules, atoms, & subatomic particles which he or she can only interact with via a fundamentally textual theory? I would submit that there's no fundamental reason why an LLM needs access to more than text to derive human-level world models.

I'm not saying that the current LLMs have derived human-level world models (they haven't). It's just that, to me, the theory that textual data is categorically not enough to do so is necessarily empirical. To back up the assertion, you'd need to construct metrics which present text-only LLMs fail to succeed with, and then you need to show how multi-modal LLMs did succeed with those same metrics. So far, I don't think adding multi-modality to LLMs actually has improved their general-purpose reasoning ability, which I consider evidence against this theory. But then I read people online just asserting it as though it's an obvious truth derivable from philosophical first-principles. It's odd to me.

NoGravitas 3 years ago | | |

> I disagree. The criticism is _not_ that basic building blocks cannot be combined to produce something richer. The issue is the "without any reference to meaning" part of the quoted definition from Bender in that article.

Right. People think the stochastic parrot description is about the Chinese Room thought experiment, but it's not. It's about the Thai Library thought experiment: https://medium.com/@emilymenonbender/thought-experiment-in-t...

circuit10 3 years ago | | |

You can say the same about humans: we only experience an approximation of the real world via our senses, never the “real thing”, so can we “truly understand” it? Yes, in the sense that we can reason about it and make and test predictions about the parts we can understand. The world we experience is based on our senses, and that’s what we understand. An LLM’s world is text, and there’s no reason to think it “truly understands” the concepts that it’s using any less than humans do.

sfpotter 3 years ago | |

Stochastic parrots have nothing directly to do with consciousness. You might consider reading the paper or at least the definition on Wikipedia more carefully.

As far as your statement regarding consciousness goes, it's glib to say that no one has any reference point for what consciousness, thinking, or feeling are. We all have our own lived experience to draw on for intuition and guidance to inform our thinking, which is invaluable. We can relate our qualitative perception of these phenomena to other things in the world, where a reasonable person can form the hypothesis that "matrix multiplication" is unlikely to be conscious, to think, or to feel by dint of it being an abstract mathematical concept, since there is no precedent for an abstract mathematical concept exhibiting any of these qualities. Indeed, the only things in our lived experience which can plausibly be said to be conscious, to think, or to feel are biological organisms, which a computer is not.

Joeri 3 years ago | |

In a sense LLM's are a Rorschach test for people's beliefs about consciousness. If you believe consciousness to be an emergent phenomenon derived from simple deterministic biological processes, then it is not a big leap to believe LLM's to be on a roadmap to consciousness. If you instead believe consciousness is a supernatural phenomenon, then you will discard the very idea of a computer having a consciousness, because a machine could never be imbued with one by mere algorithms.

Tell me what your view is on the ability of LLM's to become AGI, and I'll tell you whether you believe in an immortal soul.

piloto_ciego 3 years ago | | |

Oh, I definitely don’t think this is correct because, at least for me, the contrapositive is not true.

I believe that machines could be imbued with consciousness, and I do not rule out that there could be supernatural elements to consciousness.

Or at least things that fall outside the realm of strictly testable science.

shimfish 3 years ago | |

Roger Penrose wrote about this decades ago, arguing that computers cannot contain consciousness and neither can anything in currently understood physics.

https://www.amazon.com/Emperors-New-Mind-Concerning-Computer...

dekhn 3 years ago | | |

He never came up with any useful experiments that would demonstrate support for his position. Nor did he make a convincing theoretical argument.

gisely 3 years ago | |

It's weird seeing comments like this that argue simultaneously: 1) LLMs aren't stochastic parrots anymore! 2) You can't prove humans aren't stochastic parrots!

It's pretty clear the whole point is to minimize the difference between us and AI, but it does feel like you are undermining your argument by trying to work it from both sides. It reminds me of someone accused of a crime who says both "I didn't do it!" and "If I did it, it wasn't wrong!".

Humans aren't stochastic parrots. You can't "prove" this because it's not a mathematical fact, but there is plenty of evidence from the study of how the brain works to show this. Hell, it's even readily apparent from introspection if you'd bother to check. LLMs, on the other hand, basically are stochastic parrots because they are just autoregressive token predictors. They might become less so due to architectural changes made by the companies working on them, but it isn't going to just creep up on us like some goddamn emergence boogeyman.

Workaccount2 3 years ago | | |

Believe it or not, actual deep introspection (meditation, mindfulness) will make you realize you are more like a stochastic parrot, not less like one.

pseudotrash 3 years ago | |

> These guys have no idea what consciousness is (nobody does) ...

On that, there is a great "In Our Time" episode: https://open.spotify.com/episode/5oln4RwbhsKwjlZuxPuYYB?si=5...

Unless it is able to feel pain it remains a stochastic parrot and I wouldn't call it conscious or alive in any philosophical sense nor can one say it is capable of "feeling".

PartiallyTyped 3 years ago | |

Frankly speaking, I think this is more on us for thinking that we are special or that living forms are special. This is just our ego talking.

Under the view that we are all just complexity arising out of an unfathomably large universe, then we can accept that LLMs are just that, like us, but weaker, and that is fine.

They will improve, we can leverage them, we can live with them. It's almost as if we have created a new species that exists only abstractly; and arises out of silicon and electrons.

pizza234 3 years ago | |

> Basically there is this innate idea that if the basic building blocks are simple systems with deterministic behavior, then the greater system can never be more than that. I've seen this in spades within the AI community

I'm very surprised by this, because in essence, it's a flat-out denial of the emergence concept, no different from denying that atoms can ultimately lead to biological entities.

itairal 3 years ago | |

Totally agree.

There is also a problem to me that "stochastic parrot" is too clever. It is too good of a name and evokes such a strong mental image. It is a great name for branding purposes but because of that it is a terrible name if we are actually trying to discover truth. It can't but help to become a blunt, unthinking, intellectual weapon and rhetorical device.

me_me_me 3 years ago | |

I have noticed people look at animals or machines and try to figure out if they are as complex as humans in order to figure out if they are conscious or not.

But you almost never see the logic applied the other way around: maybe we are just a bunch of simple mechanisms convinced we are something way more complex.

There are quite a few hints that the second option is the actual reality.

fnordpiglet 3 years ago | |

The future is made by those that look past what is and see what might be and fail half way to achieving it. The rest are mired in their attachments and will never escape a murky prison of what today appears to be.

ChatGTP 3 years ago | |

I think you know what consciousness is? It’s basically your life?

isp 3 years ago |

Topical tweet from 2018:

> Optimist: AI has achieved human-level performance!

> Realist: “AI” is a collection of brittle hacks that, under very specific circumstances, mimic the surface appearance of intelligence.

> Pessimist: AI has achieved human-level performance.

https://twitter.com/dmimno/status/949302857651671040

mach1ne 3 years ago |

>"stochastic parrot" is a term coined by Emily M. Bender in the 2021 artificial intelligence research paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?"

This might be the first time the term was seen in an ’official’ context, but is it really the origin? It feels like the term has been hovering around for longer, and even Google Trends shows significant search trends way before 2021

mtlmtlmtlmtl 3 years ago | |

I see only a few peaks starting in 2008 that are in the double digit numbers. Could these just be queries containing the words "stochastic" and "parrot"?

For instance, there's this ecology paper from 2014: Influence of stochastic processes and catastrophic events on the reproductive dynamics of the endangered Maroon‐fronted Parrot Rhynchopsitta terrisi

Not sure what happens under the hood, but it wouldn't surprise me if people searching for this paper would show up under "stochastic parrot" in google trends even if that's not what they literally searched for.

naillo 3 years ago | |

I feel this way too. Maybe there's some very similar term that we're both thinking of though that is just at the tip of the tongue because I can't find what it'd be

post-it 3 years ago | | |

The Chinese room, maybe? https://en.wikipedia.org/wiki/Chinese_room

krapp 3 years ago | | |

Most people have never actually dealt with something like modern LLMs, so we haven't really developed the proper language to describe them and how they behave. It's either too simplistic and reductive (stochastic parrot, xerox machine) or presupposes sentience and intent ("fabricates", "hallucinates", etc.)

canjobear 3 years ago | |

Their paper was the first time I heard the term.

ttpphd 3 years ago | |

If you find something, post it. Otherwise this sounds like sour grapes that women coined the term.

samgilb 3 years ago |

Fun fact: philosopher Regina Rini referred to GPT-3 as a "statistical parrot" six months before the Bender et al paper came out: https://dailynous.com/2020/07/30/philosophers-gpt-3/#rini

rsynnott 3 years ago |

> They go on to note that because of these limitations, a learning machine might produce results which are "dangerously wrong"

I was initially thinking "well, yes, Nobel Prize for Stating the Obvious there", but looks like the paper was written in the far distant past of 2021, when LLMs were largely still in their babbling obvious nonsense stage, rather than the current state of the art, where they babble dangerously convincing nonsense, so, well, fair enough I suppose.

Amazing how fast progress has been there, though it's progress in an arguably rather worrying direction, of course.

sharikous 3 years ago | |

Not to reduce the value of the insight, but since she coauthored the paper with Google employees she probably had access to models more advanced than those which were available to the general public

rsynnott 3 years ago | | |

I do wonder what the state of Google's stuff in 2021 was. Here's something produced by the 2020 version of GPT-3: https://www.aiweirdness.com/roses-are-red/

At that point, OpenAI was still fairly clearly at the babbling obvious nonsense phase; I would wonder was Google's stuff much better.

I also wonder if the original authors would have been surprised to learn that, by 2023, lawyers would be citing fake precedent made up by a machine. The progression to "dangerous nonsense" really does seem to have been worryingly fast.

daniel_reetz 3 years ago | | |

Thanks for pointing this out. I've spent years in R&D and awareness always lags technology.

seydor 3 years ago |

LLMs are not stochastic though, they are deterministic and don't even require random numbers, right?

The term in general seems to be unfortunate because the models seem to do more than parroting. LLMs are more like central pattern generators of the nervous systems, able to flexibly create well coordinated patterns when guided appropriately

dudebro314 3 years ago | |

Simulations of Brownian motion are not stochastic though, they are deterministic if you fix their random seed, right?
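The point about seeds can be made concrete with a small sketch. This is an illustration only, not anything from the thread: the function name `brownian_path`, the step count, and the time step `dt` are all assumptions chosen for the example. It models a stochastic process, yet the program itself is fully reproducible once the seed is fixed.

```python
import random

def brownian_path(steps, seed, dt=0.01):
    """Simulate a 1-D Brownian path using Gaussian increments of variance dt."""
    rng = random.Random(seed)  # private generator, so the seed fully determines the path
    position = 0.0
    path = [position]
    for _ in range(steps):
        # Each increment is drawn from N(0, dt); standard deviation is sqrt(dt).
        position += rng.gauss(0.0, dt ** 0.5)
        path.append(position)
    return path

same_a = brownian_path(1000, seed=42)
same_b = brownian_path(1000, seed=42)
other = brownian_path(1000, seed=43)

print(same_a == same_b)  # same seed: the two paths are identical
print(same_a == other)   # different seed: the paths differ
```

Whether one calls such a program "stochastic" or "deterministic" depends on whether the seed is counted as part of the input, which is the distinction being argued here.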

seydor 3 years ago | | |

Stochasticity is mandatory for modeling Brownian motion.

Actually transformers do not require randomness at all, so not at all.

constantcrying 3 years ago | |

All LLMs have some random aspects.

Training alone relies hugely on many factors (e.g. initialization of parameters, order of training data, hyperparameters, etc.).

In evaluation (afaik this applies to recent models as well) you pick the continuation based on chance and not always the "best". But evaluation is the result of the training process, so all the randomness from that factors in as well.
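A minimal sketch of that distinction, under stated assumptions: the three-word vocabulary and the logits below are made up, and the helper names (`softmax`, `greedy_pick`, `sample_pick`) are hypothetical rather than any real model's decoding API. The scores themselves are a fixed function of the input; randomness only enters if the decoder samples from them instead of always taking the top-scoring token.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Deterministic decoding: always return the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, rng, temperature=1.0):
    """Stochastic decoding: draw an index in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

vocab = ["parrot", "model", "mind"]  # made-up vocabulary
logits = [2.0, 1.5, 0.5]             # made-up next-token scores

print(vocab[greedy_pick(logits)])    # the same token on every run

rng = random.Random(0)               # fixing the seed makes the draws reproducible
print([vocab[sample_pick(logits, rng)] for _ in range(10)])
```

With greedy decoding the output is a pure function of the prompt; with sampling it is a function of the prompt and the seed.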

enragedcacti 3 years ago | |

They are stochastic in the domain of meaning. Minor syntactic changes to the prompt or changes to the seed can result in substantial* changes to the meaning of the response.

*substantial as in nontrivial, not substantial as in massive

8note 3 years ago | | |

Isn't that rather "unstable" or "poorly conditioned" ?

seydor 3 years ago | | |

Similar prompts give similar continuations, not wildly diverging, so no

dekhn 3 years ago |

The real question to me is: in the next decade, as ML researchers roll out progressively more sophisticated systems, we can expect that generative systems- which may actually be "only stochastic parrots"- are going to create works that would fool any reasonable human being.

At what point does a stochastic parrot fake it till it makes it? Does it even matter? We can imagine that, within 10 years, we'll have a fully synthetic virtual human simulator- a generative AI combined with knowledge base, language parsing, audio and video recognition, basically a talking head that could join your next technical meeting and look like a full contributor. If that happens, will the Timnits and the Benders of the world admit that, perhaps, systems which are indistinguishable from a human may not just be parrots, or perhaps, we are just sufficiently advanced parrots?

Seen from that perspective, the promoters of stochastic parrots would seem to be luddites and close-minded, as well as discouraging legitimate, important, and valuable scientific research.

NoGravitas 3 years ago | |

Once you have a knowledge base connected to the language model, it's no longer a Stochastic Parrot, but something else entirely. The point of the paper is that simply continuing to scale up LLMs will not produce understanding, because a pure LLM has no connection between form and meaning. That link can be provided in other ways, though (multimodal models, robot embodiment).

dekhn 3 years ago | | |

But these language models are implicitly trained on knowledge by being fed large amounts of factual text, which (I presume) allows them to generate text that is factual (statistically more frequently than hallucinating nonfactual information). So probably recent models (which were being trained around the time the parrots paper came out) are really implicit knowledge models already. Obviously they don't have embodiment, and it's still unclear to me what level of true embodiment in the actual, real, physical world is required to make these models more than just "parrots".

renewiltord 3 years ago |

In the end, it turned out the actual innovation was doing the opposite of what this paper recommended: scaling up the LLM, improving quality by throwing lots of data at it rather than curating, and limiting bias by RLHF rather than picking the right datasets.

The organizations that listened to these people for even some amount of time got hosed in this situation. Google managed to oust this flock from within but not before their AIs were so lobotomized that they are widely renowned for being the village idiot.

Ultimately, this paper is a triumph of branding over science. Read it if you'd like. But if you let these kinds of people into your organization, they'll cripple it. It costs a lot to get them out. Instead, simply never let them in.

jal278 3 years ago | |

The long-term impact of this paper has confused me from a technical lens, although I get it from a political lens. I'm glad it brings up the risks from LLMs, but it makes technical/philosophical claims which seemed poorly supported and empirically have not held up -- imo because they chose not to engage with RLHF at all (which was deployed through GPT-3 at the time; and enables grounding + getting around 'parrotness'), and it uses over-the-top language ("stochastic parrot") which seems to capture very poorly what it feels like to meaningfully engage with models like GPT-4.

cratermoon 3 years ago | |

> limiting bias by RLHF rather than picking the right datasets

This is the same as curation and picking out the dataset, except as post-processing. The reason why RLHF has to happen (and traumatize the people <https://www.bigtechnology.com/p/he-helped-train-chatgpt-it-t...>) is to address the problems by censoring the model.

torginus 3 years ago | | |

Is it though? If you wanted to teach humans so that they don't develop unfortunate beliefs, would it be a good approach to just keep them from reading material that you find objectionable?

If you read a book that you disagree with, or one that contains falsehoods and bad reasoning as far as you can tell, would that make you believe those things?

Sunhold 3 years ago | | |

The word "trauma" is getting overused. The idea of someone being traumatized by reading fictional text is just silly. It's unpleasant or gross at worst unless you already have other issues.

the8472 3 years ago |

The first step to defeating a tiger is to realize that it cannot hurt you, for it is only made of simple atoms.

cubefox 3 years ago | |

"Machine learning? It's just statistics bro."

rchaud 3 years ago |

I've got another word for it: recipe-fication.

Everything we revile about online recipe websites that spend 1000 words about the history of cooking before getting to the point, will be part and parcel of AI-written anything. It won't be properly proofread or edited by a human, because that would defeat the purpose.

adamsmith143 3 years ago |

Yoshua Bengio, Andrew Ng, Andrej Karpathy, and many others among the top researchers in the field do not believe these models are stochastic parrots; they believe the models have internal world models and that prompts are methods to probe those world models. Stochastic parrots is one of the dumbest takes in AI/ML.

cubefox 3 years ago | |

Yeah. See e.g.

https://arxiv.org/abs/2306.03341

> Our findings suggest that LLMs may have an internal representation of the likelihood of something being true, even as they produce falsehoods on the surface.

The problem here is that there is currently no reliable way to extract information from this hypothetical world model. Language models do not always say what they "believe", they might instead say what is politically correct, what sounds good etc. Researchers try to optimize (fine-tune) language models to be helpful, honest, and harmless, but honesty ("truthfulness") can't be easily optimized for.

dehrmann 3 years ago |

Something good that came out of crypto was a lot of people thought about what money actually is. LLMs are doing the same with intelligence.

deeviant 3 years ago | |

Eh, the thing I feel most people (who lost a lot of money on crypto) learned about what money is, is that crypto is not money.

pydry 3 years ago | |

It wasn't particularly deep thinking though. The same is also true here.

brandly 3 years ago | | |

Regardless it's a good thing! Many people have had no reason until recently to break out of thinking that money=usd or intelligence=humans.

IshKebab 3 years ago | |

Yeah but they don't seem to be thinking about it very much. People keep spouting "stochastic parrot" nonsense!

api 3 years ago |

I’d argue that all these models are stochastic parrots because they’re not embodied in any way. There is no way they can actually understand what they are talking about in any way that is tied back to the physical world.

What these LLMs and diffusion models and such actually are is a lossy compression method that permits structural queries. The fact that they can learn structure as well as content allows them to reason as well, but only to the extent that the rules they’re following existed somewhere in the training data and its structure.

If one were given access to senses and memory and feedback mechanisms and learned language that way, it might be considered actually intelligent or even sentient if it exhibited autonomy and value judgments.

Invictus0 3 years ago |

Feels like this Wikipedia page is overly (self-?)promotional of the paper and its authors

constantcrying 3 years ago | |

One massive flaw of the Wikipedia model is that the people who edit Wikipedia the most "aggressively" are the ones with the most emotional investment in the topic.

This can lead to very detailed articles written by very enthusiastic people. In other cases the people who are very pro/against the subject will be the ones who put in the most effort, especially on smaller/controversial subjects.

I have seen Wikipedia pages which basically read like ads for small companies.

ttpphd 3 years ago | |

Considering that men are taking credit for their work, maybe some over-correction is understandable.

GaggiX 3 years ago | | |

Which men are taking credit for the work?

hackandthink 3 years ago |

A nice paper:

"Meaning without reference in large language models"

"we argue that LLM likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role"

https://arxiv.org/pdf/2208.02957.pdf

I remember Quine's meaning holism; it seems to be related.

https://en.wikipedia.org/wiki/Semantic_holism

RHSman2 3 years ago |

What do you think parrots think about this? Insulted.

cubefox 3 years ago |

GPT-3 was released less than a year before that, even though this now seems to be long ago. Time is moving fast with AI.

ChatGTP 3 years ago | |

Climate change moves fast too, what’s your point?

aaroninsf 3 years ago |

TL;DR: the focus on the implementation details, and descriptions like this, are detrimental, even perilous,

because such accounts are both accurate, and deeply misleading.

This is description, but it is neither predictive, nor explanatory.

It implies a false model, rather than providing one.

Evergreen:

Ximm's Law: every critique of AI assumes to some degree that contemporary implementations will not, or cannot, be improved upon. Lemma: any statement about AI which uses the word "never" to preclude some feature from future realization is false.

koalala 3 years ago |

From the article: A "stochastic parrot", according to Bender, is an entity "for haphazardly stitching together sequences of linguistic forms … according to probabilistic information about how they combine, but without any reference to meaning."

It seems to me that the great success transformers are now enjoying is precisely due to the fact that 'probabilistic information about how they combine' _is_ meaning.
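To make the quoted definition concrete, here is a toy sketch of a literal "stochastic parrot": a bigram sampler that stitches words together using nothing but counts of which word followed which. It is not from the paper and is nothing like a transformer; the corpus and the names `train_bigrams` and `parrot` are made up for illustration.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Record, for every word, the list of words that followed it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def parrot(followers, start, length, seed=0):
    """Sample up to `length` words, each chosen only from observed followers."""
    rng = random.Random(seed)  # seeded so repeated runs give the same sequence
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # the last word was never followed by anything
            break
        out.append(rng.choice(options))
    return " ".join(out)


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the parrot repeats the phrase and the parrot repeats the sound"
model = train_bigrams(corpus)
print(parrot(model, "the", 8))
```

Every adjacent word pair in the output already occurs in the corpus, so the sampler never produces a combination it has not seen. Whether much richer statistics over much more text amount to meaning is exactly what is being argued about here.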

NoGravitas 3 years ago | |

It's really not. Read the National Library of Thailand thought experiment to understand the difference. But this isn't saying that AGI is impossible, only that it can't come purely from LLMs, and that pure LLMs will remain stochastic parrots no matter how they are scaled up.

IshKebab 3 years ago | |

I agree. There's a quote in that paper about how ML models can never access meaning (semantics of words) because they only see the form (syntax and letters) and the two are somehow completely divorced.

It's obvious nonsense. I can describe a new concept to you using only words and letters and you can understand it. Therefore you can build up knowledge using only syntax.

Nobody is saying that LLMs understand the layout of a bus or the feel of leather, but they understand that buses are vehicles with four wheels that transport people etc.

Face-slappingly poor philosophy.

nologic01 3 years ago |

Rehashed language imitating sequences is a term that does not denigrate parrots.

browningstreet 3 years ago |

“stochastic” is to the tech forum as “sapiosexual” is to the online dating profile

constantcrying 3 years ago |

This also relates to vision models. The existence of adversarial attacks (e.g. imperceptible changes in the image drastically changing the output) essentially demonstrates that the model has not reached the point at which the network "understands" the generalized concept it wants to distinguish.
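A minimal sketch of the effect, assuming a toy linear "classifier" rather than a real vision network: every input value is nudged by at most 0.02 in the direction that lowers the score (a fast-gradient-sign-style step), and the predicted label flips. The weights, the "image", and the labels are all invented for illustration.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def score(w, x):
    """Linear score; positive means 'cat', otherwise 'dog'."""
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(w, x):
    return "cat" if score(w, x) > 0 else "dog"


def perturb(w, x, eps):
    """Move each input value by eps against the gradient of the score.

    For a linear model the gradient of the score with respect to x is just w,
    so stepping along -sign(w) lowers the score by eps * sum(|w_i|).
    """
    return [xi - eps * sign(wi) for xi, wi in zip(x, w)]


dim = 10_000  # number of "pixels" in the hypothetical image
w = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
x = [0.51 if i % 2 == 0 else 0.5 for i in range(dim)]

x_adv = perturb(w, x, eps=0.02)
print(predict(w, x), score(w, x))          # original label and score
print(predict(w, x_adv), score(w, x_adv))  # label and score after the nudge
print(max(abs(a - b) for a, b in zip(x, x_adv)))  # largest per-pixel change
```

The per-pixel change stays tiny, but it adds up across 10,000 dimensions, which is one common explanation for why high-dimensional models are vulnerable to this kind of attack.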

zone411 3 years ago | |

The same argument could apply to humans. For example https://en.wikipedia.org/wiki/Change_blindness.

cubefox 3 years ago | | |

That's something else. The OP was talking about small changes in pictures causing a very different classification.

constantcrying 3 years ago | | |

Not really an example. There are many ways human vision is flawed and can be tricked, but none are on the level of these adversarial examples, where imperceptible differences between images lead to a category error.

Human perception can be ambiguous, but minimal changes never cause drastic category errors.