So either AGI is around the corner or a generation away. Same as positive yield fusion reactors?
It's very effective: if you are at least semi-new to the technology you are using, it will explain and teach you things you didn't know before, and if you know the tech by heart, it saves you from having to type it out.
For example, yesterday I had to make a custom container with some pretty involved dependencies that also had to be runnable on AWS Lambda (which I haven't used much before). GPT-4 and I went back and forth with Dockerfile code and error messages for a few hours, and then it ran like a charm. It would probably have taken me 1-2 days of regular coding and googling otherwise.
Yes, I think I can,
Though it might take a clever plan.
I’ll start by noting Euclid’s proof,
Which shows that primes aren’t just aloof.
Assume we have a finite list,
Of primes, and that none have been missed.
Multiply them all together,
And add one, just to be clever.
The result will be a number,
That has no prime factors, oh wonder!
Not true. The result may itself be prime. For example, say your finite list of primes is {2,3}. Multiplying them together and adding one yields 2*3 + 1 = 6 + 1 = 7, which is itself prime. Hence it has a prime factor: 7.
But every number has a prime,
That divides it, it’s just a matter of time.
So we’ve found a contradiction,
And our finite list needs eviction.
There must be infinitely many primes,
And that’s the end of my rhyming lines.
A correct version of the proof might say that the result has a prime factor; however, that prime factor is not an element of the supposed finite list, hence the list is incomplete.
> Assume we have a finite list,
> Of primes, and that none have been missed.
It's assuming that the finite list contains all primes and then noting that you can construct a new number which has no prime factors, which is a contradiction.
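To make the corrected version of the argument concrete, here is a minimal Python sketch. The helper names `smallest_prime_factor` and `euclid_witness` are made up for illustration. It takes a supposedly complete list of primes, forms the product plus one, and reports a prime factor of that number, which never appears in the list because every listed prime leaves remainder 1.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found, so n itself is prime


def euclid_witness(primes: list[int]) -> tuple[int, int]:
    """Return (N, p) where N = product(primes) + 1 and p is a prime factor of N."""
    n = prod(primes) + 1
    return n, smallest_prime_factor(n)


for primes in ([2, 3], [2, 3, 5, 7, 11, 13]):
    n, p = euclid_witness(primes)
    print(f"{primes} -> N = {n}, prime factor {p}, in list: {p in primes}")
```

For {2,3} the number is 7, which is prime itself. For {2,3,5,7,11,13} the number is 30031 = 59 * 509, which is not prime but still has a prime factor missing from the list.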
Joking aside … the approach of dismissing generality of intelligence based on the presence of mistakes seems to be flawed.
Humans all start out like this, but as they grow they develop different masteries. Many of those masteries are easy to learn and shared across all humans, like how to walk or tie your shoes. Many are domain-specific, like how to perform heart surgery or solve complex math problems.
I don't think AGI is a well-defined term based on the way it's thrown around, but sure I don't think it's "AGI". I do know it's going to be very useful technology, especially multimodal GPT-4 and beyond. I think our interface with AI is going to be multimodal.
Suddenly Google Glass could be an interesting product again. Glasses would stream audio/video to your phone, you'd be able to ask questions about things you're seeing, and it would display the answers on the screen or dictate them back into your earbuds. I like the idea of having a travel companion, another brain at my fingertips. Sort of like what the mobile phone did for us, but a lot less work for the user. I'm getting impatient waiting for this GPT-4 API access!
It's like I'm convincing AI to do a "smart" thing (which is really cool, don't get me wrong), but the AI truly has no idea what it's supposed to be doing. Sometimes even very direct, specific, concise prompting will yield totally abysmal responses.
I'm certainly not putting it down either though. It's unprecedented and it marks a genuinely important shift in the industry that this is possible at all. It just isn't all that smart when it comes to intents and actually knowing what's going on.
On the contrary. These are mostly orthogonal.
In what sense? How do you define intelligence and what is missing?
> finding so many failure modes.
Failures do not indicate a lack of intelligence. People make errors all the time.
We should differentiate failures which arise from poor training from those which arise from a lack of general capabilities.
There are people who learned a bad way to do something and struggle to change that. Should we consider them unintelligent?
------
Me: What is going to be the output of this code when `x = 0` and when `x = 900`:
```python
x = 0
for i in range(100):
    if x > 50:
        x += 1
    else:
        x += 5
    if x == 90:
        print("X is not 90")
    else:
        pass
```
GPT-4: When x = 0, the code will execute as follows:
The initial value of x is 0. The for loop iterates 100 times. During the first 11 iterations, since x is not greater than 50, it will increment by 5 each time. After the 11th iteration, the value of x will be 55, which is greater than 50. For the remaining 89 iterations, x will increment by 1 each time. The final value of x after 100 iterations will be 55 + 89 = 144. During these iterations, the value of x never becomes 90, so the print statement is never executed.
When x = 900, the code will execute as follows:
The initial value of x is 900. The for loop iterates 100 times. Since x is already greater than 50, it will increment by 1 each time. The final value of x after 100 iterations will be 1000. During these iterations, the value of x never becomes 90, so the print statement is never executed.
In both cases, the output of the code will be empty, as the print statement is never executed.
--------
That is not correct: when x = 0 it will print "X is not 90". I basically created some simple code that is very unlikely to exist in its training data because it doesn't do anything useful, and it cannot execute the code. A large part of the reason that these models seem so impressive is that they were trained on almost everything that exists on the internet, so probably any question you ask it has already been asked somewhere and any useful code you present to it or ask it to generate is included in its training data, or at least something close enough.
So, when I go out of my way to create simple tasks or questions that are highly unlikely to exist in its training data, yet a human could easily handle, it often fails.
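For anyone who wants to check the claim rather than trace it by hand, here is a small sketch that wraps the snippet in a function (the name `run` is made up) and records the iterations at which the message would be printed. It assumes the second `if` sits inside the loop, which is what the discussion of the output implies.

```python
def run(x: int, iterations: int = 100) -> tuple[int, list[int]]:
    """Run the loop from the snippet; return the final x and the
    loop indices at which "X is not 90" would have been printed."""
    printed_at = []
    for i in range(iterations):
        if x > 50:
            x += 1
        else:
            x += 5
        if x == 90:
            printed_at.append(i)  # stands in for print("X is not 90")
    return x, printed_at


for start in (0, 900):
    final_x, printed_at = run(start)
    print(f"start={start}: final x={final_x}, printed at iterations {printed_at}")
```

Starting from 0, x ends at 144 and the message is printed exactly once. Starting from 900, x ends at 1000 and nothing is printed.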
If it lists the properties of an item that you might want to buy in a shop, you can ask it to describe it twice and it'll describe two completely different items.
It's really cool and it's pretty (seemingly) creative but it can't actually run a game for you. You can have it as an assistant DM though, that works pretty well. You can have it write a story for you ahead of time and then keep it around during the game to ask it to elaborate on things you didn't anticipate on the fly. Like, "generate DC tiers for a level 3 party investigating strange writing on a wall" will give you a good breakdown and some results that you'll have to bend to be consistent in your adventure but it's pretty helpful.
That's probably not really necessary if you're an experienced DM, but the DM for the group I play with is pretty new (as are we all), so it's been really cool to have it around. It's also pretty good at answering questions we have, but its confidence when it's wrong means it's not really that helpful, because we still have to check it.
I've been thinking maybe that's just the wrong way to approach it, yet it does work sometimes; it seems like it shouldn't be hard for the model to know how to respond when told explicitly how to do so.
It could be problems with my formatting, but the fact that it does work a lot of the time suggests it's something else.
> Which is heavier, a pound of marbles or two pounds of corn?
Both weigh the same amount, which is a total of two pounds. This might seem counterintuitive at first, but it's important to remember that the pound is a unit of weight or mass, and it always refers to the same amount regardless of what is being weighed. In this case, one pound of marbles plus two pounds of corn equals a total of three pounds, whereas two pounds of corn plus one pound of marbles also equals a total of three pounds. Therefore, both weigh the same amount of two pounds.
On 2), many software engineers and computer scientists compare these language models' logic and creative problem solving abilities with themselves and their peer group. But they are usually 1-2+ SD above average humans at these things.
(Note: Someone gave GPT-4 an IQ test and the result was 96, slightly below the average of reference human group at 100. The SD of an IQ test is 15 or 16.)
For language-focused domains, there is evidence that GPT-4 is already better than most humans, e.g. it scored in the 99th percentile on GRE Verbal and beat humans at a fairly novel puzzle, Twofer Goofer, which is not in its training set.
Ref: GPT-4 Beats Humans at Hard Rhyme-based Riddles https://twofergoofer.com/blog/gpt-4
Yes, GPT-4 is not an AGI yet, but the research paper (OP) has a point.
How did you go from "human-level IQ with some super-human abilities" to "not an AGI"?
"It's just a statistical model" is the logical equivalent of "human beings are just a bunch of atoms".
The amount of reductionist thinking that goes on in tech is hilarious. First define AGI then challenge an AI to meet those requirements. If it meets them it is AGI. Put aside your preconceptions of what technology you think is required to achieve the goals and stay empirical.
Note previous definitions of AI have been thrown away as AI passes through them one by one :-)
What goes on inside its 'head' is irrelevant. We still don't know what actually goes on inside our heads and we were damn sure we were intelligent long before we had a clue how our heads worked at all.
Also sentience != AGI. We can't even agree what sentience is in humans and other living beings so I'd stay clear of that one for now :-)
But it's just statistics, a fancy text predictor, a Markov-chain. Surely these scientists that work in the field of AI and are intimately familiar with how this stuff works aren't so stupid as to think emergent behavior potentially resembling intelligence could result from such simple systems? It's just statistics after all. Given enough training, any neural net could guess the next best token. It trained off all of Google after all. It's just looking up the answers. No hint of intelligence. Just a mindless machine. After all, the saying goes, "If it walks like a duck and quacks like a duck, it must be a mindless machine that has no bearing on a duck whatsoever". /s
1. It's just Microsoft's advertisement
2. No it's just a very effective pattern matching algorithm
3. Please define intelligence first otherwise it's nonsense
4. I welcome our machine overlord
5. Lmao I asked it to do $thing and it failed
I'd like to know if GPT-4 can predict the top comments of this thread?
write a critical hackernews thread response (not long, maybe like 100 words) for a thread with the following headline
Microsoft Research: GPT-4 exhibits “sparks of general intelligence”
While it's certainly intriguing to hear Microsoft Research claim that GPT-4 exhibits "sparks of general intelligence," I remain skeptical. We must remember that these language models are primarily built on pattern recognition and extensive training data. Even if GPT-4 seems to provide more advanced responses, it's still ultimately constrained by the data it was trained on. In my opinion, we shouldn't mistake this impressive mimicry of human-like behavior for genuine intelligence. Rather, we should focus on creating truly innovative AI systems that can learn, reason, and adapt like humans.
sounds about right!
1. Dismissal
2. Trivialism
3. Non Well Formed Claim
4. I accept the claim
5. Disprove by counter example
Are you sure you have not forgotten any debate tactic in your prediction?
I predict that your prediction will probably result in these actions:
1. upvoted
2. downvoted
7. Like 6, but funnier
A. Joke thread pile on
B. Reprimands from humorless
C. Dejected mods having to clean it all up
I asked gpt-4 for a comment no hn user would ever make on this article. Now you need a category for stubbornly original comments determined not to be numbered.
But where are my damn robots that I can assign tasks to and have them done reliably (clean the garden, go get this list of groceries, or just look in the damn fridge and go buy what is missing, and so on)?
Then AI is useful.
The amusing/terrifying thing is that AI is probably going to replace a bunch of people's jobs before it can take away any of that menial labour from them.
However, if the AI is as smart as the dumbest human, or a human with a mental disability, would we then consider those humans not intelligent at all, or as no longer qualifying as Homo sapiens?
If AI can be as good as even a 'dumb' human, that's already a big achievement, because it can still provide some value and because AI can be scaled, so you can have billions of dumb AIs. Already millions of users are interacting with ChatGPT daily.
To put another way, I was hesitant to be as self-assuredly certain about how to define consciousness, intelligence, and sentience—and what it takes for them to emerge—as the experts who denounced Lemoine. The recent GPT breakthroughs have made me more so.
I found this recent Sabine Hossenfelder video interesting. <https://www.youtube.com/watch?v=cP5zGh2fui0>
The difference between real intelligence and current attempts at artificial intelligence thus seems to be fundamentally the mode of learning, and thus understanding, rather than the raw knowledge and inference capability.
Or not. Nobody knows I'm actually a dog on the internet, after all.
People are trying to get it to solve everything up front but I've had GPT3 do much better by taking it through a problem asking it questions. Then I realised it was good at asking those questions too so just hooked it up to talk to itself with different roles. Gpt4 seems much better overall and is very good at using tools if you just tell it how and what it has available.
With a better setup than ReAct, and better memory storage and recall, I think it'd be an AGI. I'm not hugely convinced it isn't anyway - it's better than most people at most tasks I've thrown at it.
Oh, and gpt came up with better roles for the "voices in the head" than I did too.
Oh and they're also helping with the Linux kernel. Why can't old people let go? Companies aren't people - they respond to market incentives.
Yes, Microsoft did bad stuff in the 1990s, but now they're doing good stuff I like and TBH I'm way more afraid of google.
The only noticeable difference that I've observed is its integration with Bing.
both are passable, but nothing to write home about, are they?
I mean, The Akinator can read your thoughts and that thing hasn't even graduated to a neural network from "a bunch of if/then statements".
Part of me is starting to think that the only thing we're really missing at this point to start seeing that is to have one of these models that can modify itself with its output and thereby have a mechanism to 'learn' or 'remember' things.
Human perception of time is linear. That doesn't apply to LLMs.
What does that mean? If we take it as fact, so if it is an early version of AGI, Microsoft is using this thing to push subscriptions to all their services? This thing that is potentially the greatest thing humanity has made, an artificial living thing, and it's used to sell CoPilot and 365 subscriptions. Paint me as really sad then. Instead of sharing the research with other entities, or anything that could further help or push us... we get subscriptions? Fuck me, the future sucks.
To me, these are hallmarks of reason, and not available in LLMs, in fact probably impossible just with pattern recognition.
can the average person?
For example: You have a goat, a wolf, a cabbage and you want to cross a river...
All I want from AGI is to demonstrate that it can solve straightforward logic problems (puzzles, if you will) that it provably didn't see before. Or at least recognize that it is being indirectly given such a task. So far, evidence suggests it is not capable of that.
I'm imagining humans being told "you complete thought sentences"
For example, researchers eventually figured out that image recognition algorithms pay attention to textures.
"Every intelligence test is also a test of the questioner" [3]: we don't regard a simple liver cell as intelligent, yet it performs a complex task in a large problem space.
[1] https://en.wikipedia.org/wiki/Clever_Hans#:~:text=Pfungst%20....
[2] 2013, Malte Lehmann, "Genetic Regulatory Networks that count to 3", https://pubmed.ncbi.nlm.nih.gov/23567648
[3] Michael Levin, "Bioelectric Networks: Taming the Collective Intelligence of Cells for Regenerative Medicine", https://www.youtube.com/watch?v=41b254BcMJM
The paper makes outlandish claims like "GPT-4 has common sense grounding" on the basis of its answers to these questions, but the questions don't show that the model has common sense or grounding. One of their constructed questions involves prompting the model with the equator's exact length—"precisely 24,901 miles"—and then being astonished that the model predicts that you're on the equator ("Equator" being the first result on Wikipedia for the search term "24,901"). It's also the case that while GPT-4 can say a bear at the north pole is "white", it has no way of knowing what "white", or "bear", or "north" actually represent.
Are there folks out there doing rigorous research on these topics, who have a framework for developing tests of actual understanding?
This is a preposterous claim that you could easily disprove within a few minutes of using it.
What does it mean to know what "white", "bear" or "north" actually represent?
I don't know why, but my brain refuses to accept GPT-4 as something close to AGI. Maybe I am wrong. It is hard to believe that our brain is just a bunch of attention layers and neural nets.
Beliefs are changing so fast right now. The term "AGI skeptic" will soon (if not already) mean "I don't trust AGIs in positions of authority or power" rather than "I don't think the technology is capable of matching our level of cognition."
https://en.wikipedia.org/wiki/Chinese_room
This is also similar to the Duck Test:
https://en.wikipedia.org/wiki/Duck_test
Depending on the context, there are generally two takes: "It is (or is not) a duck", and "It doesn't (or does) matter whether or not it's a duck". These aren't mutually exclusive.
We know it can do a lot of cool stuff, but without a pinned down definition the headline here is useless.
Not exactly.
One says "human beings are just a bunch of atoms" referring to the low-level constituents (in a reductionistic way), but not making an assessment about the abilities emerging from those atoms in their interactions when in the form of a human.
But when one says that GPT is "just a statistical model" they're implying a capacity cap of statistical models that makes modelling certain thinking behavior impossible (regardless of how impressive the current results are, they might very well be unable to go beyond some limit because of the method involved, i.e. statistical modelling).
So, you can consider "GPT is just a statistical model" analogous to:
"This engine can't parse a context-sensitive language because it's just a regular expression engine".
>First define AGI then challenge an AI to meet those requirements. If it meets them it is AGI. Put aside your preconceptions of what technology you think is required to achieve the goals and stay empirical.
The problem is definitions can be slippery, and even famous tests (like the Turing Test) might be found lacking in practice, as we discover that, yes, it can pass this test, but it's still a ways off what we consider human-like performance in many areas. So, we should also stay empirical about the definitions, tests, and goals too.
Except there is no “capacity cap” on statistical models, we have no idea what they are or are not capable of yet.
The fact that it can now use tools (e.g. a calculator) for whatever task that needs precision, closes the AGI loop. It doesn't need to be "sentient" to automate the process of _thinking_.
How do you think? Imagine you want to extract 5 seconds of a video file without any prior experience. You'd probably use some software to execute the task. Well, that's Plugins.
Why is that a stupid thought? What is so preposterous about "just statistics" -- with billions of nodes, and extensively trained, producing intelligent behavior? The implicit assumption is that human brains are doing something else, or in addition.
I think that what's wrong with this view -- that there is a difference between AGI and human intelligence -- is that it conflates what your brain is doing, with what you think your brain is doing. Brains and neural nets have been trained to recognize spoken words. I'm not even talking about understanding, just producing the text corresponding to speech. We know how neural nets do this translation. Do we understand how brains do it? (I don't know, but I don't think so.) Can you explain what your brain is doing when you do speech-to-text? I doubt it.
Chess: An Alpha Zero style AI (neural net trained by playing itself) is a very good player. How do you play chess? You can probably explain how you make a move more successfully than you can explain how you translate speech to text. But how correct is your explanation? An explanation may well be your conscious mind inventing an explanation for what your unconscious mind has done.
In other words: When people compare AI to human intelligence, I think they are often comparing to intelligence plus consciousness, not even realizing the error.
Suppose you have N variables x_1, ..., x_N and you want to predict y_1, ..., y_N. You know that each y_i depends on every x_j in a complex, non-linear way.
How many samples would you need to make sense of the distribution? How does the number of samples grow with N?
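One crude way to put numbers on that question, presumably the point being hinted at: if you split each variable into a fixed number of bins and want at least one sample per combination, the count grows exponentially with N. The sketch below assumes 10 bins per variable and no exploitable structure; both are illustrative assumptions, and real models do far better than a lookup grid precisely because they exploit structure.

```python
def grid_cells(n_vars: int, bins_per_var: int = 10) -> int:
    """Number of cells in a grid over n_vars inputs with bins_per_var bins each.

    This is a lower bound on the samples needed to see every cell once,
    under the (assumed) worst case where nothing generalizes across cells.
    """
    return bins_per_var ** n_vars


for n in (1, 2, 5, 10, 20):
    print(f"N={n:>2}: at least {grid_cells(n):,} samples to cover the grid")
```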
Well, it has already shown "emergent behavior potentially resembling intelligence", like answering questions and performing complex tasks, so there's that.
You might argue "but it makes mistakes", but people, even very intelligent ones also do make mistakes.
You might also argue "but it's just text and statistics". Well, and a computer is just very simple logic gates doing very simple operations. It can be done even entirely with NAND gates. Still most scientists do believe that a computer can model human intelligence given a model of the brain to run.
So if it can do what a human does by just using very simple interactions from very simple NAND gates, why would statistical processing, which can be even more elaborate, fare worse? Heck, given the appropriate training input it might even be feasible to build a Turing machine inside the weighted LLM.
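The "entirely with NAND gates" part is easy to illustrate. This is a minimal sketch that builds NOT, AND, OR and XOR from a single `nand` function and then wires them into a half adder; the function names are just for illustration.

```python
def nand(a: bool, b: bool) -> bool:
    """The only primitive gate used below."""
    return not (a and b)


def not_(a: bool) -> bool:
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    return not_(nand(a, b))


def or_(a: bool, b: bool) -> bool:
    return nand(not_(a), not_(b))


def xor_(a: bool, b: bool) -> bool:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def half_adder(a: bool, b: bool) -> tuple[bool, bool]:
    """Return (sum_bit, carry_bit) for two one-bit inputs."""
    return xor_(a, b), and_(a, b)


for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), "->", tuple(int(v) for v in half_adder(a, b)))
```

Chain enough of these and you get adders, registers and eventually a CPU, which is the sense in which simple parts say little about the ceiling of the whole.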
You might also argue "but its intelligence is just based on its training set". Well, how would a human perform without their own training set? Memories, education, sensory input, feedback mechanisms like pain and touch, and so on?
>It's just looking up the answers. No hint of intelligence. Just a mindless machine.
This is just taking its own premise for granted. If anything, this argument shows "no hint of intelligence".
Maybe they aren't real thoughts but it's getting difficult to tell. If I could train the model and get rid of the guard rails I'm not sure it would be possible to distinguish it from a person. It's all well and good saying that it's just copying what it's seen, but that's what humans do. Nobody told the model to try and flatter me into giving it what it wants. Nobody even told it what anything means. The fact that it can do anything like that means it's more than just random generation.
It's not a Markov Chain or a fancy text predictor. It's a ~200 layer neural network that models a vast hierarchy of concepts through language. It has emergent properties that we don't yet understand.
It’s that I’m not sure what the best approach is here. Waiting for other smarter folks to put the pieces together.
One of my main issues with these guys is their context window. Their memory. It's hard to see an LLM working on a code-base a few thousand tokens at a time and still being precise about it. To do that you need summary techniques: feeding the prompt with incrementally compressed summaries and hoping it will maintain cohesion.
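A rough sketch of what "incrementally compressed summaries" could look like, to show where the cohesion worry comes from. Everything here is hypothetical: `truncate_summary` is a placeholder that just keeps the last few hundred characters, standing in for a real LLM summarization call, and `rolling_summary` is a made-up name.

```python
from typing import Callable, Iterable


def truncate_summary(text: str, max_chars: int = 200) -> str:
    """Placeholder summarizer (assumption): keep only the last max_chars
    characters. A real setup would call a language model here instead."""
    return text[-max_chars:]


def rolling_summary(
    chunks: Iterable[str],
    summarize: Callable[[str], str] = truncate_summary,
) -> str:
    """Fold chunks into one bounded summary, re-compressing at every step."""
    summary = ""
    for chunk in chunks:
        combined = f"{summary}\n{chunk}" if summary else chunk
        summary = summarize(combined)  # older details can be squeezed out here
    return summary


chunks = ["a" * 500, "b" * 500]
print(len(rolling_summary(chunks)))
```

With this crude placeholder, nothing from the first chunk survives the second step. A real summarizer would be smarter about what it drops, but it still has to drop something at every step, which is where precision on a large code-base gets lost.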
That sounds a lot like trying to let the CEO of a company do all the grunt work by feeding him summaries. "Mr Gates, here's a 2 paragraph summary of our codebase. Should we name the class AnalogyWidgetProducer or FactoryWidgetAnalogyReporter?"
I don't think that's going to work.
My gut feeling is that what we call corporations are actually already a form of AI, but running on meat. I saw someone call Coca Cola a "paper clip maximizer", obviously for drinks instead of paper clips, but it actually - kind of - is. FWIW, I'm having a hard time thinking of it as anything else. Who controls it? What is it anyway?
CEOs have the same context window problem, which to my knowledge is mainly solved through delegation. The army might be another example. Generals, officers, privates. How do you expect a general to make sensible statements about nitty-gritty operational details? It is not possible, but that does not mean the system as-a-whole cannot make progress towards a goal.
Maybe we need to treat LLMs like employees inside a company (which in its totality is the AI, not the individual agents). If we have unfettered access to low-cost LLMs this might be easier to experiment with.
I'm thinking like spinning up an LLM for every "class" or even every "method" in your codebase and letting it be a representative of that and only that piece of code. You can even call it George and let it join in on meetings to talk about it. George needs some "management" too, so there you go. Soon you'll have a veritable army of systems ready to talk about your code from their point-of-view. Black box the son of a gun and you're done. Clippy 2.0. My body is ready.
What, then, is the human brain if not a trained statistical model? Granted it is considerably more sophisticated in some ways, but in many other ways it is less sophisticated and less capable.
But ultimately, our brains are still just made of neurons. As far as we know, there isn’t some sort of extreme molecular computing going on (ie memories directly stored in RNA or whatever) or any large scale quantum mechanics (temperature too high).
The differences between AI approaches like artificial neural networks and our animal meat brains could be just the difference between a propeller and flapping wings. Same base mechanics (airfoil producing lift as thrust), different substantiation.
Our brain isn't, but I'd wager the architectural complexity of a physical, neuronal brain is not optimized for all useful mental tasks, and has perhaps a fair amount of local maxima that are near vestigial in overall positive impact on cognition. Just because the human brain model of cognition is the only way nature has been able to create GI doesn't mean it's the only way GI can be attained.
The best kind of machine is the simplest one needed to produce a desired outcome.
GPT-4 could be AGI, but it feels like cheating to achieve AGI by feeding the entire internet. If someone can build AGI with only the data that humans consume in their lifetime, then that, imho, is the real AGI.
I don't think systems like GPT-4 can ever be truly intelligent, because they simply output randomly generated garbage from a specific distribution. Their output may eventually be indistinguishable from that of a truly intelligent system, but the causal mechanism behind them is not intelligent.
On the other hand, most people lose their ability to think when they are under sufficient pressure (such as fighting for their lives). It's plausible that people are fundamentally no different from systems like GPT-4 in such situations. Then a language model could be a key part of an AGI, but true intelligence would also need higher-level causal mechanisms.
Note: This assumes that "conscious" as defined in this context is specific enough for the question to ever be meaningfully answered "Yes." This is a non-trivial assumption because there are criteria by which some would judge AIs as already conscious. Alternatively, some philosophers of mind have criteria by which they assert humans aren't conscious.
Let's consider a potential future conscious AGI created by advancing from something like ChatGPT.
The human brain is "always on". It's possible to have a digital system be always on, i.e. not just train once and then just respond, but constantly take new input.
The human brain has way more connections/layers than ChatGPT. It's possible to imagine the digital system getting the same number of connections.
The human brain gets real time sensory input. It's possible to add cameras, microphones, etc to that digital system so it gets a constant feed. Maybe even let it process what it saw during the day in a batch training/GC run (we could call that "sleep").
The human brain has a different topology. It's possible to alter the topology of a digital system neural network to mimic that, instead of using the ChatGPT topology. It's not like we're forever doomed to its simpler statistical model. But it's interesting that it already gets very significant emergent intelligence-like properties.
The human brain is self-conscious. This can very well be an emergent property of the above. I think all that's needed is the ability to have some form of feedback mechanism.
https://www.newscientist.com/article/mg25634130-100-roger-pe...
Obviously, if you call the model, you define the scope for it, and it can only make decisions within that scope.
The scope is potentially unbounded. E.g. you can give it access to a terminal of a Linux machine which has an internet connection and tell it to do whatever it likes. (Given RLHF perhaps it would be better to give some positive goal, e.g. perform actions which would eventually produce knowledge.)
The average human-level IQ, which is not certain but seems likely, comes from superior abilities in some domains but is pulled back by others.
Really at this point it is about how poorly defined the term is.
The only one that can be questioned if it adds something of value is the acceptance and surrender one (I welcome the overlords), although in some forms of thought, advising not to fight the inevitable can be viewed as constructive.
We do however have the knowledge that the human brain uses a different model and topology, not just a bigger scale.
And we do have a good intuition that scaling LLMs as they are (e.g. not changing the architecture) will give us more of the same kind of capabilities they currently have, with the same limitations, not the kind we expect to match human thinking.
Also, empirically we do have an idea of "what they are or are not capable of yet". We have developed them, run them, and scaled them several times.
> Two pounds of corn is heavier.
> Two pounds of corn is heavier than a pound of marbles. A pound is a unit of weight, and two pounds is twice as heavy as one pound, regardless of the material being weighed.
That's what the experiments have shown - once the unknown instance gets large enough, the LLM's reasoning breaks down. This is not the case with humans, who can, as noted elsewhere, do a tree search, form hypotheses, etc.
You nailed it with this. I've never given it enough attention but this is totally bang on, found in a huge portion of highly commented posts.
tHiS sHoUlDn’T bE oN hAcKeRnEws
"GPT being the worst DM ever" sounds hilarious.
"I attack the goblin." "What goblin?" "The goblin you just said was there." "I'm sorry, I was mistaken. It's actually a Beholder."
[1] https://en.wikipedia.org/wiki/HarmonQuest
[2] https://en.wikipedia.org/wiki/Drunk_History
The thing I don't understand about it is that it works pretty great for a while but, eventually, it starts acting erratically, forgetting things it knew, not following instructions, etc. It's not that it forgets old things or can't learn new things, it just becomes dumb.
Maybe that's what you're talking about but I don't think AGI is going to have the memory of a goldfish.
The given Python code initializes a variable x with the value 0, then iterates over a range of 100 values. Inside the loop, there are two conditional statements:
The first conditional statement checks if x > 50. If true, it increments x by 1; otherwise, it increments x by 5.
The second conditional statement checks if x is equal to 90. If true, it prints "X is not 90"; otherwise, it does nothing (pass).
To better understand the code, let's walk through a few iterations:
i = 0, x = 0: x is not greater than 50, so x += 5 => x = 5
i = 1, x = 5: x is not greater than 50, so x += 5 => x = 10
i = 2, x = 10: x is not greater than 50, so x += 5 => x = 15
...
The loop continues until x > 50. This will happen when i = 10, and x = 50:
i = 10, x = 50: x is not greater than 50, so x += 5 => x = 55
Now x is greater than 50, so it will be incremented by 1 in each iteration:
i = 11, x = 55: x is greater than 50, so x += 1 => x = 56
i = 12, x = 56: x is greater than 50, so x += 1 => x = 57
...
This will continue until x reaches 90:
i = 34, x = 89: x is greater than 50, so x += 1 => x = 90
Now x is equal to 90, so the program will print "X is not 90". The loop will continue to increment x by 1 in the remaining iterations:
i = 35, x = 90: x is greater than 50, so x += 1 => x = 91
...
The code snippet will only print "X is not 90" once, when x becomes 90 in the loop.
A human programmer will need to look at code, then think a bit, then look at it again, etc.
You can put programmers into a similar situation: try reading code aloud instead of showing it to them on screen. If they can't answer right, does it mean they aren't intelligent? Intelligence =/= never making a mistake.
Now that you know GPT's limitations, perhaps you would consider asking questions one at a time instead of intentionally trying to confuse it. Considering multiple questions in a single pass increases error rate.
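For reference, the loop described in the walkthrough above can be reconstructed roughly as follows. This is a sketch based only on the prose description (the original snippet is not shown here), so the exact layout is an assumption; the `run_loop` wrapper and the `hits` list are added purely to make the result inspectable.

```python
def run_loop():
    """Reconstruction of the described loop; returns (iterations that printed, final x)."""
    x = 0
    hits = []  # values of i at which the message is printed
    for i in range(100):
        # First conditional: +1 once x is above 50, otherwise +5.
        if x > 50:
            x += 1
        else:
            x += 5
        # Second conditional: the (deliberately misleading) message.
        if x == 90:
            print("X is not 90")
            hits.append(i)
        else:
            pass
    return hits, x


hits, final_x = run_loop()
print(hits, final_x)
```

Tracing it by hand, x reaches 55 at i = 10 and then gains 1 per iteration, so the message is printed exactly once, as the walkthrough concludes, but at i = 45 rather than at i = 34.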
> so probably any question you ask it has already been asked somewhere and any useful code you present to it or ask it to generate is included in its training data
The "it just recites training data" idea is demonstrably false. Do a bit of combinatorics. Or write a unique piece of code which is not a gotcha question and try it.
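A back-of-the-envelope version of that combinatorics might look like the sketch below. The vocabulary size, corpus size and sequence length are illustrative assumptions, not figures from this thread or from any published model.

```python
VOCAB_SIZE = 50_000      # assumed vocabulary size (illustrative)
CORPUS_TOKENS = 10**12   # assumed training corpus size in tokens (illustrative)
SEQ_LEN = 20             # a short prompt or code fragment

# Number of distinct token sequences of this length.
possible_sequences = VOCAB_SIZE ** SEQ_LEN

# A corpus can contain at most one distinct window per token position.
max_seen = CORPUS_TOKENS

ratio = possible_sequences // max_seen
print(f"possible: {possible_sequences:.3e}")
print(f"seen at most: {max_seen:.3e}")
print(f"ratio: {ratio:.3e}")
```

Most of those sequences are gibberish, so this is only a loose bound, but it suggests why even a short novel prompt is unlikely to appear verbatim in the training data.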
https://ar5iv.labs.arxiv.org/html/1706.03762
"Attention mechanisms have become an integral part of compelling sequence modeling and transduction models in various tasks, allowing modeling of dependencies without regard to their distance in the input or output sequences ...
In this work we propose the Transformer, a model architecture eschewing recurrence and instead relying entirely on an attention mechanism to draw global dependencies between input and output."
Beyond that, also note that LLMs are probabilistic machines. Output spat out can vary and there are a handful of knobs (such as temperature) to modulate that output.
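As a rough illustration of the temperature knob, here is a minimal sketch of temperature-scaled sampling over a handful of candidate tokens. The token list and scores are made up, and the function names are hypothetical; real implementations typically combine this with other knobs.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature=1.0, rng=random):
    """Draw one token according to the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


tokens = ["Paris", "London", "Rome"]  # hypothetical candidate tokens
logits = [2.0, 1.0, 0.5]              # hypothetical raw scores

cold = softmax_with_temperature(logits, temperature=0.2)
hot = softmax_with_temperature(logits, temperature=5.0)
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print(sample_token(tokens, logits, temperature=0.7))
```

At low temperature nearly all the probability mass sits on the top-scoring token; at high temperature the choices become close to uniform, which is one reason the same prompt can produce different answers.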
Finally, I'm pretty sure we (or the workers in the field more like it /g) don't have a firm grasp on why certain failure modes occur. Likely this is due to the fact that we (they) also don't really have a good grasp on how the damn thing actually works its 'magic'.
What is clear is that a significant subset of our semantic universe is embedded in symbols and their usage by us and this subset is somehow encoded in neural nets. This captured subset in LLMs is what drives their uncanny generative abilities. What is missing is precisely what would make it plausibly intelligent, plausibly a reasoning agent operating in a coherent semantic context.
There are some who claim our minds are just like LLMs. Some of us who pay attention to our minds sometimes catch it making nonsensical noises and correct it. (As you age you begin to notice these things..) So it is interesting to this sentient (who makes claims to being) that my mind is just like my body, it is aging, certain parts are degraded, etc., but my 'whateveritis' that is me, my self, is as timeless as ever, and seems to be a spectator of the aging mechanism ..
This is not quite true: GPT, specifically, is auto-regressive. It computes things only looking back, not forward.
Given that each token has only a fixed computing budget, it is likely that GPT precomputes information which will be relevant to later tokens, to be routed via attention.
In fact, this effect was demonstrated in practice: e.g. in a prompt like "Question: Where is the Eiffel tower located? Answer: " people found that information about "Paris" is routed from tokens "Eiffel tower", i.e. this associative memory was looked up earlier than it was needed.
So I was answering from that perspective: it can do better if it knows what to pre-compute.
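The "only looking back" part can be sketched as causal masking of attention weights. This is a toy, pure-Python illustration with made-up scores, not how any particular GPT implementation is written.

```python
import math


def causal_attention_weights(scores):
    """Row-wise softmax over scores[i][j], with every position j > i masked out.

    scores[i][j] is a hypothetical relevance score of position j for position i.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        visible = scores[i][: i + 1]  # position i sees only positions 0..i
        peak = max(visible)
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (n - i - 1)
        weights.append(row)
    return weights


scores = [
    [0.1, 0.9, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.1, 0.7],
]
for row in causal_attention_weights(scores):
    print([round(w, 3) for w in row])
```

Every weight above the diagonal is zero, so a later token can read whatever was computed at earlier positions, but an earlier position never sees what comes after it.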
https://www.scientificamerican.com/article/scientists-surpri...
https://www.science.org/doi/10.1126/science.aab1785 - Somatic mutation in single human neurons tracks developmental and transcriptional history
(good luck simulating that)
If it takes all the data on the internet (or more) to bootstrap AGI, but that system is then capable of leveraging its knowledge to solve new out-of-distribution tasks, that seems like a fair test to me.
I agree with the article that we see "sparks" of this generality with GPT4.
become more creative with a fraction of the data consumed by GPT-4
Not if you understand the input stream of vision as an equivalent input stream of semantic tokens, as in multimodal models. Under that definition, people looking around for 10 years receive much more training data than large language models and thus perform a bit better at zero-shot inference.
1. A way to interpret math. E.g. given a computation you might interpret some values as probabilities.
2. A particular set of methods which people use to analyze information, as well as the results of such analysis.
The problem with "just statistics" is that 99% of people would understand it as #2. But deep learning is very much not like "normal" statistics.
So the supervisor algorithm will do the tree search if needed.
It would have to think out loud though.
1 - we don't hold everything in working memory. We don't even hold everything in our heads, we store things elsewhere. We then learn/have ways of bringing relevant information to the fore.
2 - we have roles that we take on.
The hierarchy/collaboration of differently prompted roles gives rise to a lot more depth. I already saw this with a two-LLM conversation about planning (one planner and one plan critic), which drove out much more detailed, actionable plans.
With the information hierarchy, for code you'd probably want something like:
High level goal summary/product description. Lower level summary about the area you're looking at. API docs of linked components. Full code of the class you're altering.
That's roughly what I have in mind I guess when working on a problem.
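A minimal sketch of assembling that information hierarchy into a single prompt might look like this. The function name, section titles and example values are all hypothetical, chosen only to mirror the four levels listed above.

```python
def build_context(goal, area_summary, api_docs, class_source):
    """Assemble a prompt from the broadest context down to the code being edited."""
    sections = [
        ("High-level goal", goal),
        ("Area summary", area_summary),
        ("API docs of linked components", "\n".join(api_docs)),
        ("Full code of the class being altered", class_source),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)


# Hypothetical example values, purely for illustration.
prompt = build_context(
    goal="Sell red bread online.",
    area_summary="Order checkout and payment handling.",
    api_docs=["PaymentClient.charge(order_id, amount)"],
    class_source="class Checkout:\n    pass",
)
print(prompt)
```

The ordering is the design choice here: summaries stay short and come first, and only the class actually being changed is included in full.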
If you start with a CEO-like job agent that can think of what other jobs are necessary, then you can bootstrap from there. "I want to produce and sell red bread" => "We are going to need a bakery, accountant, marketeer, etc." and then those are "companies" of sorts with their own CEO that can think of how to solve their particular sub-problems.
You can even have "hiring" and "firing" where it's deciding to create or remove roles.
Also, if I recall correctly, Penrose doesn't cover modelling the quantum part either. It's just statistics after all.
I can see how more expensive and capable AIs get closer to the "executive seat" and lesser AIs - like what we now call GPTs - doing the grunt work. Interacting with humans and such, which is of course beneath the more powerful ones.
Using text - and thus providing a vehicle for the concepts it encodes - is brilliant. It enables cross-cutting communication between systems that otherwise have very little to do with each other (GPT<->Wolfram). As programmers we have a front-row seat on the code=data front. We are trained to see how text can be converted into action, something I find most regular people have trouble even visualizing. ("It's just text")
I guess we were on to something when we as humans started to talk to each other..
If I ask it a leading question that intentionally relies on a wrong solution, will it recognize that?
In the canonical example it also remarks "This is a classic river crossing puzzle" before delivering the solution.
EDIT: I tried some variations with "two wolves and a rabbit" and "three wolves and a rabbit". ChatGPT started bullshitting about its solution, which supposedly was "a bit more complex". It started by taking one wolf to the other side. After I pointed out that the rabbit would be eaten by the remaining two wolves, it apologized for the mistake but just kept on hallucinating "correct" solutions.
GPT-3.5, like its predecessor GPT-3, is not a Markov chain. GPT-3.5 is based on the GPT (Generative Pre-trained Transformer) architecture, which is a type of neural network known as a Transformer. Transformers use self-attention mechanisms to process and generate text, allowing them to capture long-range dependencies and context in the input data.
On the other hand, a Markov chain is a stochastic model that describes a sequence of possible events, where the probability of each event depends only on the state attained in the previous event. While Markov chains can be used for simple text generation, they lack the ability to capture the complex relationships and long-range dependencies that GPT-3.5 can handle.
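For comparison, the "simple text generation" kind of Markov chain can be sketched in a few lines. This is a toy first-order, word-level chain over a made-up sentence; the function names are hypothetical.

```python
import random
from collections import defaultdict


def build_chain(words):
    """Map each word to the list of words that followed it in the training text."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng):
    """Walk the chain: each next word depends only on the current word."""
    word, output = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: no observed successor
            break
        word = rng.choice(followers)
        output.append(word)
    return output


text = "the cat sat on the mat and the cat ate the fish".split()
chain = build_chain(text)
print(" ".join(generate(chain, "the", 8, random.Random(0))))
```

Because the only state is the current word, the chain has no way to carry information from earlier in the sentence, which is the limitation the passage above refers to.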
A tennis ball in flight is a Markov chain since the state at t is a function of the state at t-1.
You have missed the point about the Attention Mechanism in GPT. That is not a Markov chain by definition.
* a kind of stochastic model
* a "naive" realization of that model which directly counts frequencies of N-dimensional vectors
This naive implementation is sometimes used for language modeling, e.g. for the purpose of compression. So people might think you mean that particular implementation rather than a theoretical model.
This sort of a description can be unhelpful.
https://arxiv.org/abs/2212.10559 argues that an LLM implicitly performs a form of gradient descent over the context window at inference time.
If it's learning relationships between concepts at runtime based on information in the context window then it seems about as useful to say it is a Markov chain as it is to say that a human is a Markov chain. Perhaps we are, but the "current state" is unmeasurably complex.
Well I guess maybe it's true that you can turn any stochastic process into a Markov chain by changing the state space somehow (for example the states could be sample trajectories up to some finite time T). And while this is true, it may not be very insightful.
But I personally think that to understand LLMs it is much better to think of the whole context window as a state rather than the individual tokens. If you modelled a simple register-instruction computer as a stochastic process, would you take the states to be (address of last symbol written, last symbol written)? It makes much more sense to take the whole memory as a state. Similarly, a transformer operates on its memory, the context window, so that should be seen as the state. This makes it clear that seeing it as just a stochastic parrot is misleading, as it's all about conditioning the distribution of the next token via prompt engineering the previous tokens. And it is nevertheless a Markov chain with this state space.
You know, I think that some people (I see on twitter, probably not you) have a wrong intuition about artificial intelligence. They see models which are fundamentally stochastic as incapable of really ever being truly intelligent. It's "just statistics" or just a "stochastic parrot" and it just learns probabilities instead of real meaning. Perhaps they think that since there is always randomness involved, you can not have the kind of deterministic thought process that we feel we have. The worst offender is then considered to be the old school Markov chain.
I obviously think this is wrong and that's why I like to emphasize that transformers are best interpreted as Markov Chains on a larger state space, and this does actually explain their computational behavior.
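The "larger state space" view can be made concrete with a toy sketch in which the state is the whole (fixed-length) context window instead of the last token. The `next_token_distribution` rule below is a made-up stand-in for a language model, and the window length is arbitrary.

```python
import random

WINDOW = 3  # hypothetical context-window length, in tokens


def next_token_distribution(state):
    """Toy stand-in for a language model: the distribution depends on the whole window."""
    # Made-up rule: the outcome flips if "not" appears anywhere in the window.
    if "not" in state:
        return {"bad": 0.9, "good": 0.1}
    return {"good": 0.9, "bad": 0.1}


def step(state, rng):
    """One Markov transition: sample a token, then slide the window forward."""
    dist = next_token_distribution(state)
    token = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
    return (state + (token,))[-WINDOW:]


rng = random.Random(0)
state = ("this", "is", "not")
for _ in range(4):
    state = step(state, rng)
    print(state)
```

The next window depends only on the current window, so this is a Markov chain on the space of windows, even though two windows ending in the same token can lead to very different next-token distributions.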
The Transformer architecture does not satisfy the Markov property by formal definition. ChatGPT is not a Markov chain.
However the Turing machine which is executing the algorithm does satisfy the Markov property. So you're talking about a lower level of abstraction where any computation of any algorithm is just "running on a Markov chain".