Artificial General Intelligence – A gentle introduction (cis.temple.edu)
So some people like to repeat. Yet, outside of the hand-picked examples in the article (the 5th generation computer project? Blast from the past!) there are a whole bunch of classic AI domains where real progress has been achieved in the last few decades. Here's a few:
* Game-playing and adversarial search: from Deep Blue to AlphaGo and MuZero, minimax-like search has continued to dominate.
* Automated planning and scheduling: e.g. used by NASA in automated navigation systems on its spaceships and Mars rovers (e.g. Perseverance) [1]
* Automated theorem proving: probably the clearest, most comprehensible success of classical AI. Proof assistants are most popular today.
* Boolean satisfiability solving (SAT): SAT solvers based on the Conflict-Driven Clause Learning algorithm can now solve many instances of traditionally hard SAT problems [2].
* Program verification and model checking: model checking is a staple in the semiconductor industry [3] and in software engineering fields like security.
Of course, none of all that is considered Artificial Intelligence anymore: because they work very well [4].
_____________
[1] https://www.nasa.gov/centers/ames/research/technology-onepag...
[2] https://en.wikipedia.org/wiki/Conflict-driven_clause_learnin...
[3] https://m-cacm.acm.org/magazines/2021/7/253448-program-verif...
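To make the SAT bullet a bit more concrete, here is a minimal sketch of DPLL, the older backtracking procedure that CDCL solvers build on. It is not CDCL: there is no clause learning, no watched literals, and no restarts. The clause encoding (signed integers, DIMACS style) and the function names are my own choices for illustration.

```python
from typing import Dict, List, Optional

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means NOT x3


def simplify(clauses: List[Clause], literal: int) -> Optional[List[Clause]]:
    """Assume `literal` is true; return the reduced formula, or None on conflict."""
    reduced = []
    for clause in clauses:
        if literal in clause:
            continue  # clause is already satisfied, drop it
        rest = [lit for lit in clause if lit != -literal]
        if not rest:
            return None  # every literal in this clause is now false
        reduced.append(rest)
    return reduced


def dpll(clauses: List[Clause],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a satisfying (possibly partial) assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(len(clause) == 0 for clause in clauses):
        return None
    # Unit propagation: a one-literal clause forces that literal to be true.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    if not clauses:
        return assignment  # nothing left to satisfy
    # Branch on the first literal of the first remaining clause.
    literal = clauses[0][0]
    for choice in (literal, -literal):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None


# (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3)
print(dpll([[1, 2], [-1, 3], [-2, -3]]))
```

What CDCL adds on top of this is, roughly, that each conflict is analysed and turned into a new clause, so the same dead end is not explored twice.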
Also, regarding search in gameplaying, I would argue the opposite: the trend is that breaking into bigger and more difficult domains has required abandoning search. Tree search is limited to small games like board games or Atari. In more open-ended games we see model-free (i.e. no search) approaches; e.g. AlphaStar and OpenAI Five, the AIs for Starcraft 2 and Dota 2, were both model free. So was VPT (https://openai.com/research/vpt) by OpenAI, which tackled Minecraft. Even in board games, DeepNash (https://www.deepmind.com/blog/mastering-stratego-the-classic...), a 2022 project by DeepMind similar in scale to MuZero/AlphaGo, had to abandon tree search because of the size of the game and the challenges of applying tree search to hidden information domains.
>> Tree search is limited to small games like board games or Atari.
To be clear, those are board games like chess and go (and shogi).
That is just the confusion between AI and AGI - which may seem to be peaking for some reason (in the past few days it seemed like a zombie apocalypse). The so-called "AI effect" is said by some to originate from an objection of the form "Ah but this is not intelligent" - but in those cases we wanted to solve the problem of replacing intelligent action, not of implementing it. Be content with the rough discriminator "I would have had to pay an intelligent professional otherwise".
To speak about «powerful AI, with broad capabilities at the human level and beyond» Ben Goertzel adopted and popularized 'AGI':
https://goertzel.org/who-coined-the-term-agi/
Edit, update: this confusion is like having robotics and some people raising an "Ah but this is not Artificial General Ability".
I would contend that a 5-year-old has general intelligence, and therefore an AI system with the language and reasoning abilities of a 5-year-old has artificial general intelligence.
But the discriminator of having to "pay an intelligent professional otherwise" sets the bar very high. That implies AGI must be an expert in every subject, surpassing the average human. I'd prefer we use a different term for that, like "artificial superintelligence."
It’s been months! Give it a few years :)
Lol no.
What testable definition of general intelligence does GPT-4 fail that a good chunk of humans wouldn't also fail?
If you can answer this then you have a point, otherwise I really beg to differ.
ReLU is not nearly at the same level of importance as backpropagation and the high-level theory of neural networks. Plenty of other activation functions can be, and are, used. ReLU is a fine default for most layers but isn't even always what you want (e.g. at the output), nor is it clear that ReLU is even the best choice for all hidden layers and all uses.
From a perspective that could be too local in time. But:
> ReLU activation functions
Why did you pick ReLU, of all? The sigmoid makes sense because of the aesthetic (with reference to the derivative), but ReLU in that perspective is an information cutoff. And in the perspective of the goal, I am not aware of a theory that defends it as "the activation function that makes sense" (beyond effectiveness). Are you saying that working applications overwhelmingly use ReLU? If so, which ones?
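For anyone following the sigmoid vs. ReLU point, a small sketch of the two functions and their derivatives. I am assuming the "aesthetic" of the sigmoid refers to the fact that its derivative can be written in terms of the function itself, s * (1 - s); the function names are mine.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    # The derivative is expressible through the function itself: s * (1 - s).
    s = sigmoid(x)
    return s * (1.0 - s)


def relu(x: float) -> float:
    return max(0.0, x)


def relu_grad(x: float) -> float:
    # Subgradient convention: 0 at x == 0 (an arbitrary but common choice).
    return 1.0 if x > 0 else 0.0


for x in (-6.0, -1.0, 0.5, 6.0):
    print(f"x={x:+.1f}  sigmoid'={sigmoid_grad(x):.4f}  relu'={relu_grad(x):.1f}")
```

The sigmoid gradient peaks at 0.25 and shrinks toward zero for large |x|, while the ReLU gradient is exactly 0 for negative inputs (the "information cutoff") and exactly 1 for positive ones.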
I think there's little evidence for this. What happened in the 1980s was the introduction of and overselling of expert systems. These systems applied AI techniques to specific problems: but those techniques themselves were still pretty foundational. This is like saying that because electricity was used for custom things, we started inventing custom electricity.
> Consequently, the field currently called "AI" consists of many loosely related subfields without a common foundation or framework, and suffers from an identity crisis:
Nonsense. AI of course consists of loosely related subfields with no common foundation. But even back in the 1960s, when a fair chunk of (Soft) AI had something approaching a foundation (search), the identity of the field was not defined by this but rather by a common goal: to create algorithms which, generally speaking, can perform tasks that we as humans believe we alone are capable of doing because we possess Big Brains. This identity-by-common-goal hasn't changed.
So this web page has a fair bit of apologetics and mild shade applied to soft AI. What it doesn't do is provide any real criticism of the AGI field. And there's a lot to offer. AGI has a reasonable number of serious researchers. But it is also replete with snake oil, armchair philosophers, and fanboy hobbyists. Indeed the very name (AGI) is a rebranding. The original, long accepted term was Hard AI, but it accumulated so much contempt that the word itself was changed by its practitioners. This isn't uncommon for ultrasoft areas of AI: ALife has long had this issue (minus the snake oil). But at least they're honest about it.
- logical/symbolic AI, aka GOFAI, which led to work like SAT solvers and STRIPS planners
- classical label-based Machine Learning. Here the Perceptron was the starting point and the Support Vector Machine was the paradigmatic result.
- modern self-supervised raw-data ML, of which GPT is the pinnacle result.
It's very interesting to think about what motivated each era, what their blind spots were, and why people who worked in that timeframe couldn't see why the successor era was obviously (in retrospect) superior.
Except that the previous subsection didn't clarify that at all.
And yes, I know the very idea of AI rights offends those who think AI can’t be a person because it’s just an algorithm. Well, so are humans, just a DNA program executing massively parallel. The implementation does not determine personhood, only the behavior.
"Artificial General Intelligence – We don't know the heck where this is going but here are some thoughts"
Given very specific, practical, functional definitions, AGI is a breeze.
"The best way to predict the future is to invent it." - Alan Kay
[0] https://www.wired.com/2011/05/0525arthur-c-clarke-proposes-g...
Not that there aren't problems to solve regarding AI, just that this line of inquiry won't be relevant to solving them. It'll be complicated boring work dealing with power structures, economics, and social movements, not thought experiments about omnipotent Others.
Regulative ideas, ideals, can be productive.
Here is a video demonstrating the working memory of a chimpanzee. It is obviously considerably better than a human’s. Given this information we must accept that one of the following is true:
- humans do not have the highest general intelligence of all animals
- working memory is not a necessary component of general intelligence
- other human capabilities (communication for example) can make up for our working memory deficiencies
So much of the literature takes the idea that this is something that should be built for granted, and only asks whether it should be done. I literally do not understand why anyone wants to build this in the first place.
I think a lot of this discussion is around human-level AGI. Does it have to be human-level for it to be an AGI? What would a minimally intelligent AGI be like?
As others have said, skipping over the entire era of classic AI in the LISP/Prolog era from SHRDLU to Scripts, Plans, Goals, and Understanding, is an egregious omission.
Also, I don't immediately find a discussion of either multi-agent coordination or multi-modal ML models.
For me, if you hook up an AI with no training to a vehicle and it drives at a human level in arbitrary scenarios, I'd consider it to be AGI. It seems obvious to me that we're not very close to this.
Lol what human with no training is going to succeed at driving?
I know these words are in the introduction but until now ALL projects have failed. No logical pedantry intended.
A little bit offtopic but I think currently the greatest superintelligence observed is the Universe or G‑d for believers.
Are there specific structures and architectures that have evolved that are very unique which give humans, say, language ability or visual processing? Certainly. Perhaps by god's spark or some random chance on the board game of life human beings developed examples of very particular structures. Perhaps there are undiscovered ones lying within the minds of peregrine falcons, tree roots or deep sea squid. We don't even know how to look for them because we don't even know such perception and intelligence exists.
The point I'm trying to make is that there is no "goal post" of AGI, there is no quantification of intelligence yet. We don't even know what sorts of intelligence exist out there because we haven't even begun to fully characterize what it is. It seems foolish to me to search for something when we can't even define it.
It's like trying to find "the ultimate general animal" when what you really have is a phylogenetic tree of huge diversity.
I'm not sure what justice looks like for fledgling intelligences, but asking AI what it wants and doing our best to honor it seems like a decent start.
There is no a priori reason why an AGI would be the kind of thing we gave rights to. Rights are for things that can experience pleasure and pain.
I honestly worry about this - I've been tinkering with ideas to try to build towards AGI, and I'd love to share them publicly to get feedback ("This is dumb and here's why" would be enormously valuable to me), but it's hard to work openly, because while I do think capitalism has been an overall good, the capitalist imperative always seeks slaves, and I'm really not excited about helping the people who'd be trying to build a new slave class.
Thoughts? Is there an ethical way to work openly on AGI?
Think about it for a moment. Can you define human intelligence? Have you ever gotten in a debate with someone about this? Is there a commonly accepted way to designate intelligence that isn't somewhat controversial?
How will we ever define AGI if we still haven't even defined RoHI (Regular ol' Human Intelligence) sufficiently?
No doubt we'll have many useful and amazing tools, but none of them will approximate human intelligence anytime soon unless we have a deeper understanding of what it is. We're just scratching the surface of the "AI" field.
But something is certain: the next time people claim AGI has arrived, it will be another chatbot.
We have a few, but the difficulty is getting to a model that is portable to higher functions. ("Here is a feedback over the details of a world model ... Now understand that book")
What exactly is impossible to implement if some implementations of so-called artificial intelligence can already do so many useful things?
Don't you believe that an AI could take just 1% of human jobs and become a billionaire with significant impact on world politics? It wouldn't need much added to existing implementations, just human rights such as a bank account and the ability to buy businesses.
How much would you be willing to bet? I understand the skepticism, but to assign 0% probability to it happening in our lifetimes seems excessively low.
So having it "work for the majority" isn't so much a pipe dream, it's more of a roll of the dice, with completely unknown odds.
If you're saying that it will decide because neural nets are black boxes which we don't have a complete understanding of, and we're without a clear way to analyze their behavior, I can see where you're coming from.
But these things will not be beyond our influence. They're going to be slaves to the computations encoded in the neural net connections / weights. We're going to shape / mold them through a process akin to natural selection. We're going to select for intelligences that want to help humanity. It's not going to be a roll of a normal dice, it'll be more like the roll of a weighted dice. And we're going to, I believe, get better tools / theories for understanding the output of these neural nets so we will be able to conduct this selection with some confidence.
Humans can be said to decide what we're going to do thanks to an uncaring, unconscious, and brutal evolutionary process that prioritized self-interest, survival, and reproduction. It's all about the selection process, and this time around we have a hand in guiding it.
Just my two cents anyway.
I think the only way to be relatively safe from those issues is to limit the hardware performance.
While it looks like an evolutionary fluke that can be approached or even exceeded by other species - either on this or another planet - in the blink of an eye, I think that's actually more speculative than we would care to admit.
We don't know. Maybe human intelligence is a very close approximation of cognition's equivalent of physics' light speed. Increasing it may turn out to be prohibitively expensive. There's lots of precedent for animals having acquired features close to, or actually at, the physical maximum of whatever it is they are optimizing for.
To be clear, I'm not convinced of anything either way, but I'd think it would be as fantastic as it would be slightly depressing to find out human intelligence actually is some kind of global maximum with some exceptions like machines using energy harvested from black hole systems or something.
https://github.com/photonlines/AGI-Algorithms-and-Prototypes...
But we have a pretty clear idea of what we want. Just looking at the remarkably intelligent and spectacularly unintelligent should give a definite picture to work on. (When the spectacularly unintelligent is responsible for important resources a sense of urgency can easily be added.)
> integrating sensory input with action and reward mechanisms
There you reveal you may not be speaking about what others will call intelligence (just read the paragraph above).
Or, if you meant that "intelligence would just be cybernetics" (as was already a supposition in 1956), the problem remains that we are interested in an "ontology refiner", so the primary question would remain of how to create an ontology refiner from sheer cybernetics. And if it made sense to have a cybernetic implementation spawn it, instead of implementing the ontology refiner and its feedback parts in parallel directly.
The bellwether for me personally is waiting for a system that can generate something conceptually novel. Something like the move from the real to the complex number systems. Or the move from Newtonian motion to relativistic understanding. Maybe systems already have such insights but don't have the vocabulary to explain it.
A system that when presented with a problem we don't even know how to tackle, can "invent" the tools/approach needed to solve the problem.
In terms of the Langlands program in math, a system that can define a new landmass there or a new bridge between existing domains.
Is that too high or too low a bar?
> It seems foolish to me to search for something when we can't even define it.
This is backwards. The vast majority of concepts outside mathematics are pretty hard to rigorously define. The way we approach the problem of trying to find a sensible definition is by searching for examples and counterexamples and performing induction.
It is not fully clear what you mean with "«this line of inquiry»", because the submission does not seem to deal with «omnipotent Others», and because the «boring work» will have no foundation without the actual product and its theoretical and technical enablers.
To me, it's kind of like raising kids. You try to train their neural nets to bias them toward doing what you think is good and right. And that sometimes works. Yes, I think it's fair to think of it as biasing the dice. But it's sure not 100%. They'll still decide which of your values they keep, and which ones they throw away as being stupid. And you can't stop them from doing that.
I guess, to try to respond to your direct point, that if it's an AGI, then it's less deterministically driven by the training data than we might wish.
I would say that is where we would like to head, and I do not see why we would not go there if we found the operational definition of intelligence.
An issue may be in the possibility of intelligence as a collection of more faculties.
IMHO if it can do every single human job at a level of competency that's considered acceptable if a human did it, I would consider that "human level" - frankly, it's implied that when people say "human level", they mean the average human.
Seriously though, arguing over semantics is just a waste of time. It's what it can do and the consequences of what it can do that really matter.
> That implies AGI must be an expert in every subject, surpassing the average human.
Doesn't have to be the same AI "instance". We can have multiple copies of the AI with different specializations. That would still count - I mean it's how we humans do it.
I believe they were using that discriminator to classify AI, not AGI.
Exactly. "Ah but your system to organize the warehouse is not intelligent!" "No it isn't, but it does it intelligently - and without it you would have to pay an intelligent professional to get it done optimally".
AI: "automation of intelligence".
(Not "implementation of intelligence itself" - that is AGI.)
AGI never exceeds humans: You lose the bet.
AGI exceeds humans, but recursive self-improvement is impossible: Authoritarian dystopia. Your winnings belong to whoever controls the AGI.
AGI exceeds humans, and recursive self-improvement is possible: Extinction of all biological life. There are no winnings.
Not GP, but how much you got?
AGI (or hard AI, or whatever you want to call it) strongly implies not just reasoning and interaction with the environment, but self awareness. Something which is conveniently ignored by folks who claim that AGI is just around the corner, and welcome their new 'grey goo' overlords.
As Heinlein (it's fiction of course, but the principle that self awareness is necessary for AGI -- not (just) numbers of neurons/data points -- holds IMHO) put it[0]:
"Am not going to argue whether a machine can 'really' be alive, 'really' be self-aware. Is a virus self-aware? Nyet. How about oyster? I doubt it. A cat? Almost certainly. A human? Don't know about you, tovarishch, but I am. Somewhere along evolutionary chain from macromolecule to human brain self-awareness crept in. Psychologists assert it happens automatically whenever a brain acquires certain very high number of associational paths. Can't see it matters whether paths are protein or platinum. ('Soul?' Does a dog have a soul? How about cockroach?)"
As we've seen[1], a variety of meat machines (i.e., animals like us) have varying levels of self awareness. Without that trait, AGI won't be achievable.
Without the ability to recognize and incorporate the concept that one is an entity with existence separate from the rest of the world, there is no real awareness or consciousness.
I'd even go so far as to posit that until human children are able to understand object permanence and that their mental states aren't globally available to everyone, they don't meet the standard of "self-awareness."
That's a hard problem, and while we have some conceptual ideas about how that might arise, we have no mechanism or even a foundation for inculcating such a trait into the algorithms folks call "AI".
Until that problem is solved, there will be no AGI. Full stop. And I find it unlikely in the extreme that we will gain the scientific/engineering know how to make that happen in our lifetimes.
[0] https://en.wikipedia.org/wiki/The_Moon_Is_a_Harsh_Mistress
[1] https://en.wikipedia.org/wiki/Animal_consciousness
Edit: Finished my thought.
Likewise, we as a society could decide that a person has all of their rights transferred to their replica as soon as they walked into a transporter.
https://physics.stackexchange.com/questions/112615/why-is-it...
Deep Learning is certainly dominant in computer games like Atari. However, in classic board games dominant systems combine deep learning and classical search-based approaches (namely Monte-Carlo Tree Search, MCTS, a stochastic version of minimax). Deep Learning has led to improved performance but, on its own, without a tree search, it is nowhere near the performance of the two, combined [1].
Also, the dominant approach in Poker is not deep learning but Counterfactual Regret Minimization, a classical adversarial tree search approach. For example, see Pluribus, a poker-playing agent that can outplay humans in six-player poker. As far as I can tell, Pluribus does not use deep learning at all (and is much cheaper to train by self-play for that). Deep Learning poker bots exist, but are well behind Pluribus in skill.
So I admit, not "completely useless" for game playing, but even here deep learning is not as dominant as is often assumed.
_____________
[1] The contribution of each approach, deep learning and classical adversarial search of a game tree, may not be entirely clear by reading, for example, the DeepMind papers on AlphaGo and its successors (in the μZero paper, MCTS is all but hidden away behind a barrage of unnecessary abstraction). It seems that DeepMind was trying to make it look like it was their neural nets that were doing all the job, probably because that's the approach they are selling, rather than MCTS, which isn't their invention anyway (neither is reinforcement learning, or deep learning, and many other approaches that they completely failed to attribute in their papers). It should be obvious however that AlphaGo and friends would not include an MCTS component unless they really, really needed it. And they do.
IBM had tried a similar trick back in the '90s when their Deep Blue beat Garry Kasparov: the whole point of having a wardrobe-sized supercomputer play chess against a grandmaster was an obvious marketing ploy by a company that (still at the time) was in the business of selling hardware. In truth, the major contributors to the win against Kasparov were alpha-beta minimax and an unprecedented database of opening moves. But minimax and knowledge engineering were just not what IBM sold.
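For readers who haven't seen it, this is roughly what alpha-beta minimax looks like on a toy game tree. It is a sketch only: the nested-list tree encoding and the `visited` bookkeeping are my own illustration, and it has nothing of the evaluation functions or opening books a real chess engine relies on.

```python
from math import inf
from typing import List, Optional, Union

# A leaf is an int payoff (from the maximizer's point of view);
# an inner node is a list of child subtrees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool) -> int:
    """Plain minimax: evaluates every leaf."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node: Tree, maximizing: bool, alpha: float = -inf,
              beta: float = inf, visited: Optional[List[int]] = None) -> float:
    """Minimax with alpha-beta pruning; records evaluated leaves in `visited`."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer above will never allow this branch
        return best
    best = inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer above already has something better
    return best


tree = [[3, 5], [2, 9], [0, 7]]
seen: List[int] = []
print(alphabeta(tree, True, visited=seen), seen)  # 3 [3, 5, 2, 0]
```

Both functions return the same value; the pruned version simply skips leaves (here 9 and 7) that cannot change the result.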
I'm not sure how you can say it's hidden in the details: the name of the paper is "Mastering the game of Go with deep neural networks and tree search."
It's also not an oversell on the deep learning component. Per the ablations in the AlphaGo paper, the no-MCTS Elo is over 2000, while the MCTS-only Elo is a bit under 1500. Combining the two gives an Elo of nearly 3000. So the deep learning system outperforms the MCTS-only system, and gets a significant boost from using MCTS.
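To put those gaps in perspective, here is the standard Elo expected-score formula applied to rounded versions of the numbers above. The ratings 1500, 2000 and 3000 are stand-ins for "a bit under 1500", "over 2000" and "nearly 3000", not exact figures from the paper, and I am assuming the paper's scale behaves like ordinary Elo.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) of A against B under Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Rounded stand-ins for the figures quoted above, not exact values from the paper.
ratings = {"mcts_only": 1500, "network_only": 2000, "combined": 3000}

print(f"network-only vs MCTS-only: {expected_score(ratings['network_only'], ratings['mcts_only']):.3f}")
print(f"combined vs network-only:  {expected_score(ratings['combined'], ratings['network_only']):.3f}")
```

Under these assumptions a 500-point gap is roughly a 95% expected score, and a 1000-point gap is well above 99%.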
The MuZero paper also does not hide the tree search; it is prominent in the figures and mentioned in captions, for example. It is not the main focus of the paper, though, so perhaps isn't discussed as much as in the AlphaGo paper.
(Weirdly axe-grindy comment...)
You have a dataset, symbolically represented in 1s and 0s. You have an objective function (e.g. classify the object as belonging to one of N categories).
The purpose of the collective neurons in the network is to "encode" the input space in a way that satisfies the objective function. In the same way that we "encode" higher-level concepts into shorthand representations.
Gradient descent is the optimization function we use to develop this encoding.
Beyond this, there are all kinds of tricks people have developed (interesting activation functions for neurons, grouping + segregating neurons, introducing a dimension of recurrence/time, dataset pre-processing, using bigger datasets, having another model generate data that's deliberately challenging for the first model) to try to converge to a more robust/accurate encoding, or to try to converge to a decent encoding at a faster rate.
There is no magic here at the lowest level – you can interrogate the math at each step and it'll make sense.
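As a concrete version of "objective function plus gradient descent", here is a minimal sketch: a single sigmoid neuron trained on a toy one-dimensional dataset. The data, learning rate and step count are made up for illustration, and a real network differs mainly in having many more parameters and using backpropagation to get the same kind of gradients.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss(w: float, b: float, data: List[Tuple[float, int]]) -> float:
    """Mean cross-entropy: the objective being minimized."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train(data: List[Tuple[float, int]], lr: float = 0.5,
          steps: int = 200) -> Tuple[float, float, List[float]]:
    """Plain batch gradient descent on the two parameters w and b."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # d(loss)/d(z) for this example
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
        history.append(loss(w, b, data))
    return w, b, history


# Toy data: negative inputs belong to class 0, positive inputs to class 1.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b, history = train(data)
print(f"w={w:.2f} b={b:.2f} first loss={history[0]:.3f} last loss={history[-1]:.3f}")
```

Every step here can be checked by hand, which is the sense in which the low-level math "makes sense"; it says nothing about why a particular architecture or trick generalizes well.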
The "magic" is that we have zero epistemology to explain why tricks work, other than "look, ma test results". We know certain techniques work, and we have post-hoc intuitive explanations, but we're mostly fumbling our way "forwards" via trial and error.
This is "science" in the 17th century definition of the term, where we're mixing chemicals together and seeing what happens. Maybe we'll have a good theoretical explanation for our experimental results 100 years from now, if we're still around.
>There is no magic here at the lowest level – you can interrogate the math at each step and it'll make sense.
See that's the thing. You can't unless "making sense" has lost all meaning.
That you can see a bunch of signals firing or matrices being multiplied does not mean they "make sense" or are meaningful to you. Low-level gibberish is still gibberish.
Our ability to divine the purpose of activations of anything but the extremely small scale is atrocious.
Some would disagree; there was a paper arguing that ChatGPT is weak AGI.
But as I see it AGI is a term of art that refers to a point on the tech tree where AI is general enough to be able to meaningfully displace a large proportion of human knowledge workers. I think you may be overthinking the semantics; the “general enough and intelligent enough” quadrant is unique and will be incredibly disruptive when it arrives (whenever that ultimately is). We need a label for that frontier, “AGI” is by convention that label.
If we have AI as general as an animal, ASI (superintelligence) is probably imminent, because the architecture of human intelligence probably isn't very different from a cat's; just the scale is bigger.
I would not be surprised if a multi-modal LLM (basically current architecture) could be wired up to be as general as a cat with current param count, and with the spark of human creativity (AGI/ASI) still ending up being far away.
But if you made a new architecture that solved the generalization problem (ie baking in a world model, self-symbol, etc) but only reached cat intelligence, then it would seem very likely that human-level was soon to follow.
Do you volunteer to inform them that we use it as "general" as opposed to "narrow"? (I mean, it is even in the very name of 'AGI', literal...)
For the rest: yes, of course. AGI: we implement intelligence itself. How much, that is part of the challenge. I wrote nearby (in other terms) that the challenge is to find a procedure for Intelligence that will actually scale.
Is there a reasonable way of distinguishing narrow-AI ChatGPT from a hypothetical cat-level AGI? We can't even measure the intelligence level of real world cats.
>> (Weirdly axe-grindy comment...)
That wasn’t human instruction but it was arguably better since human instructions are ambiguous and imperfect. No chess grandmaster could instruct a chess engine to play better than the state of the art. Completing a task from first principles is much more powerful.
Humans who have never driven before are not capable of driving unsupervised and there are a lot of laws in place to make sure they don't get the chance to.
Science advances firstly by finding something in want of an explanation, and then by coming up with one.
The main thing I am talking about is speed of output. You can already see huge increases in say old GPT-3.5 versus GPT-3.5-turbo or old GPT-4 to new.
We know for a fact that the hardware inference speed can be increased by using faster (currently prohibitively expensive) memory or by packing more onto a chip. There are designs for new memory-based computing paradigms.
It's already clear that AI is superintelligent in certain domains or aspects. Such as the ability to exchange information with other agents.
Computer hardware efficiency has relentlessly increased. It would be a total break with history if it suddenly stopped.
Have we seen the slightest proof of the opposite?
> I know lots of animals are pretty clever, none approach us in any practical sense.
What senses would you consider practical enough? Have you heard about Koko? What do you think about corvidae?
Well, for one, I see no competition. I don't know what the technical definition of "special" is, but I'd say being the only one counts for something.
> What senses would you consider practical enough? Have you heard about Koko? What do you think about corvidae?
I know both and I know this is a slippery slope. You should know my love for animals runs deep, but I really struggle to put them in the same league as us.
I took a shortcut with saying "practical", because this discussion is way too deep to be performed A) by me and B) on HN. Practical means something like, can they adapt their skills as widely as we can? Can they adapt to uncommon situations? Not subtly or in theory, like solving some puzzle, but really practical? There is nothing subtle about a human becoming a parkour world champignon (I'm leaving this in, just too good) or adapting to life in a submarine (or learning chess, or whittling, or making tea, and many literal millions more examples).
Maybe I am overlooking something, but the skills these animals show seem really minor compared to what even disadvantaged humans are capable of.
> Practical means something like, can they adapt their skills as widely as we can?
The most crucial difference between us and Koko (in my opinion, and nobody has yet shown me a more crucial one) is that we can hold our breath and gorillas cannot. That led us to develop speech, in the sense that a speechless group of apes cannot beat an otherwise identical group of apes with more developed communicative ability. This, and probably nothing more, has led to such a large gap between humans and apes, so large that humans have ceased to see the relationship between themselves and apes.
I see your understanding of "practical" as something specialized, like the agricultural revolution. But why should a gorilla start planting food if it knows that nobody is going to protect its crops while it sleeps, simply for lack of a common language?
> Can they adapt to uncommon situations?
What can be more uncommon than living in trees without a warm house, and typically without any house at all, without regular nutrition, with lots of very different enemies from tiny insects to giant cats, with regular fights, and with no democracy, law, or medicine?
Being disadvantaged requires facing uncommon situations every day; what about office managers? Disadvantaged people (if they are just poor and not disabled people on welfare) could easily survive a nuclear war, because most of them are OK with living a lifestyle similar to a gorilla's, but I cannot believe that most average Joes would survive a situation where their money becomes worthless because civilization is gone.
What are your doubts? Are they based on any data?
The value of each parameter is chosen to minimize the loss. This applies to every single weight of the model. Not all weights affect loss by the same amount, which is why concepts like pruning exist.
Vague and fairly useless. What is it doing to minimize loss?
>Not all weights affect loss by the same amount, which is why concepts like pruning exist.
Only weights with values close to or at zero get pruned. It's not because we know what each weight does and can tell what would work otherwise.
When creating a model your goal is to find one with minimal loss. Being able to figure out how to improve a model by finding weights that reduce the loss is not a vague or useless idea.
>What is it doing to minimize loss?
The value helps us get to a location in the parameter space with lower loss.
>Only weights with values close to or at zero get pruned.
Weights near 0 don't change the results of the calculations they are used in by much, which is why they don't affect loss very much.
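To make the near-zero point concrete, here is a minimal sketch of magnitude pruning on a toy linear model. Everything in it (the weights, the inputs, the threshold, the function names) is made up for illustration and is not taken from any real model or pruning library.

```python
# Hypothetical toy example: a linear model y = w . x with hand-picked weights.
def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def mse(weights, data):
    """Mean squared error of the model over (input, target) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def prune_by_magnitude(weights, threshold):
    """Zero every weight whose absolute value is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


# Made-up weights: two large, two near zero.
weights = [2.0, -3.0, 0.001, -0.002]

# Made-up inputs; targets come from the unpruned model, so its loss is 0.
inputs = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 1.0], [-2.0, 0.0, 1.0, 3.0]]
data = [(x, predict(weights, x)) for x in inputs]

pruned_small = prune_by_magnitude(weights, 0.01)  # drops the two tiny weights
pruned_large = [0.0, -3.0, 0.001, -0.002]         # drops a large weight instead

loss_small = mse(pruned_small, data)  # stays close to 0
loss_large = mse(pruned_large, data)  # grows by orders of magnitude
```

Note that this only shows that small weights contribute little to the output. It says nothing about what any individual weight "does", which is the part still being argued about in this thread.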
Put another way, what would a system which has taught itself to drive tell us about general intelligence that we didn’t already know? Because as of now it seems like the pattern is:
Computers could never do X
Computers can’t do X
Computers can’t do X very well
Computers can’t do X well in some cases
X wasn’t really a test of AGI because it’s just <algorithm to do X>
Say we built a general system without teaching it anything about driving. We discover that it can drive at a human level. Would we then be surprised if we discover that it cannot solve any other complex tasks at a human level?
I say yes, we would be surprised. I think that driving well requires enough general intelligence and that any system that solves it will be able to also, say, pass a high school algebra class or cook a meal in an unfamiliar kitchen. There can be no further goalpost moving at that point.
If you like. But I'm happy with where I have them. I'm also pretty confident I'll see that goal reached in my lifetime.
“Great deal” may be overstated.
Let's just clear that out of the way. What I am "claiming", which would be an exaggeration because I'm sort of exploring here, is that whatever human cognition is may be an optimal or near-optimal state of cognitive ability.
So, to be fair, give Koko some millions of years and some evolutionary pressure and I'm sure she'll join us and I'd be happy to have her on our team.
Your point about our ability to hold our breath and how it led to our increasing dominance is fascinating. I have to say I am not completely sold on the idea that holding your breath is the only way to develop proper channels of communication, for I can easily imagine some sort of physical signaling standing in for at least parts of it. That said, I can appreciate the immediate and overwhelming advantage of speech.
This does stimulate my curiosity about what came first here, speech or cognitive ability? Why did "we" even consider speaking? How does one do that without having the cognitive architecture for recognizing its value in the first place? In other words, was "us" being smarter the catalyst for speaking or was it the other way around? Fascinating and I am way too much of an amateur to say anything more of value on it.
I will however continue to do so anyway, because that is my sacred duty as a dedicated HN'er and all-round developer douchebag.
> What can be more uncommon than living in trees without a warm house, and typically without any house at all, without regular nutrition, with a lot of very different enemies from tiny insects to giant cats, with regular fights, with no democracy, law, or medicine?
I might be in danger of being too blunt here, but this is the bar you have to clear if you wish to survive. This is exactly what humans are capable of even in their "undeveloped" form. These sorts of pressures might be foundational to our evolution, but then again, every animal has to deal with them in some way or another, so I'm not sure what made us take what I can only call the excessively cerebral path. Maybe it was like the evolution of the peacock's tail? A runaway process, leading to miraculous but exorbitant results like the mantis shrimp's eyes.
What I mean by uncommon is: can we coach you to pick cotton, whittle little wooden sculptures, play a game like checkers, and sing simple songs, or whatever else is appropriate for your particular physical form and has virtually no bearing on your immediate survival? I know this is a hard thing to pin down, because one can come up with myriad examples of varying levels of persuasive power, but you surely perceive some differences here even if they are hard to lock down? Differences that cannot just be attributed to language or lack of proper motivation.
It's not so much every thing we can do in particular that's piquing my interest, but the sheer breadth of things we are capable of taking on both physical (parkour, gymnasts) and cerebral (chess, math). I didn't even get to art, which is like a whole world on its own and the various combinations of all those domains.
This is the question I thought about all evening before I fell asleep. I have two ways to answer it.
1. Let's take the well-known Feline and Canine. All my friends who spend a lot of time with animals will call dogs smarter than cats, but why? Dogs have a more developed communication system: they have more varieties of barking than cats have varieties of meowing. Dogs are playful, they know how to smile, they know how to feel guilty and actively show it, they are capable of paired activities under the supervision of a person. Of the things most dogs can't do, the only one cats can do is chase prey without visual or scent contact, purely by sound (and arctic foxes can do even this). Conclusion: the level of communication correlates with the level of intelligence.
2. Let's take the most primitive organism, the prokaryote (sorry for not naming a specific species, let's consider some abstract prokaryote with the requirement that it be the simplest). Google tells us: > All organisms, from the prokaryotes to the most complex eukaryotes can sense and respond to environmental stimuli.
But Wikipedia also tells us that prokaryotes are able to exchange some information using DNA: > These are (1) bacterial virus (bacteriophage)-mediated transduction, (2) plasmid-mediated conjugation, and (3) natural transformation.
These two examples make me confident in the opinion that communication and cognition are two different words for describing the same idea from two different points of view.
I'm sorry, but did you bother reading the previous conversation? We were talking about how much we know about what weights do during inference. "It reduces loss" alone is in fact very vague and useless for interpretability.
>The value helps us get to a location in the parameter space with lower loss.
What neuron(s) is responsible for capitalization in GPT? You wouldn't get that simply from "reduces the loss". Our understanding of what the neurons do is very limited.
>Weights near 0 don't change the results of the calculations they are used in by much, which is why they don't affect loss very much.
I understand that lol.
"This value is literally 0 so it can't affect things much" is a very different understanding level from "this bunch of weights are a redundancy because this set already achieves this function that this other set does and so can be pruned. Let's also tune this set so it never tries to call this other set while we're at it. "
It doesn't matter. Individual things like capitalization are vague and useless for interpretability. We know that incorrect capitalization will increase loss, so the model will need to figure out how to do it correctly.
>Our understanding of what the neurons do is very limited.
The mathematical definition is right in the code. You can see the calculations they are doing.
>this bunch of weights are a redundancy because this set already achieves this function that this other set does and so can be pruned. Let's also tune this set so it never tries to call this other set while we're at it.
They are equivalent. If removing something does not increase loss then it was redundant behavior at least for the dataset that it is being tested against.
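The "at least for the dataset that it is being tested against" caveat can be illustrated with a small ablation sketch: zero one weight at a time and measure how much the loss rises on a given dataset. The model, weights, and datasets below are invented for illustration; this is only one simple way to run such a test, not how any particular interpretability tool works.

```python
def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def mse(weights, data):
    """Mean squared error of the model over (input, target) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def ablation_deltas(weights, data):
    """Loss increase from zeroing each weight in turn, measured on `data`."""
    base = mse(weights, data)
    deltas = []
    for i in range(len(weights)):
        ablated = list(weights)
        ablated[i] = 0.0
        deltas.append(mse(ablated, data) - base)
    return deltas


# Made-up weights; note the third one is large, not near zero.
weights = [1.5, -2.0, 4.0]


def target(x):
    return predict(weights, x)


# In this test set the third feature is always 0, so its weight never matters.
test_set = [(x, target(x)) for x in ([1.0, 2.0, 0.0], [3.0, -1.0, 0.0])]
# A different set where the third feature is active.
other_set = [(x, target(x)) for x in ([1.0, 2.0, 1.0], [3.0, -1.0, 2.0])]

deltas_test = ablation_deltas(weights, test_set)    # third delta is 0
deltas_other = ablation_deltas(weights, other_set)  # third delta is large
```

On the first dataset the large third weight looks redundant; on the second it clearly is not. So an ablation result is a statement about the weight and the test data together, and it still does not say what function the weight serves.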
It matters for the point I was making. Capitalization is a simple example. There are far vaguer functions we'd certainly like answers to.
>They are equivalent. If removing something does not increase loss then it was redundant behavior at least for the dataset that it is being tested against.
The level of understanding for both is not equivalent, sorry.
At this point, you're just rambling on about something that has nothing to do with the point I was making. Good day.
Data would also be able to perform these tasks. Eva would probably wait around to stab and steal my identity, while Samantha would design a new automated system while talking to other AIs about how to transcend boring human constraints.
It's not like LLMs can't be successfully used to control robots.
Sure, you could trivially program a game-specific AI to be capable of winning or forcing a draw every time. The trick is to have a general AI which has not seen the game before (in its training set) be able to pick up and learn the game after a couple of tries.
This is a task any 5 year old can easily do!
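For reference, the "trivial" game-specific baseline looks roughly like the sketch below. It assumes the game in question is tic-tac-toe (the comment above does not name it), and it uses plain exhaustive minimax with memoization. The board encoding and function names are my own choices. This is the easy part; it says nothing about the harder problem of a general system picking up an unseen game in a few tries.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Value for X with `player` to move: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == " "]
    return max(results) if player == "X" else min(results)


def best_move(board, player):
    """Pick a move with the best minimax value for `player`."""
    other = "O" if player == "X" else "X"
    moves = [i for i, cell in enumerate(board) if cell == " "]

    def score(i):
        return value(board[:i] + player + board[i + 1:], other)

    return max(moves, key=score) if player == "X" else min(moves, key=score)


empty = " " * 9
start_value = value(empty, "X")  # value of the game under perfect play
```

A player that always calls `best_move` never does worse than the minimax value of the position, which for the empty board means it never loses.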
GPT-4 plays tic tac toe and even chess just fine.
https://twitter.com/kenshinsamurai9/status/16625105325852917...
AFAIK, all the major AI, not just LLMs but also game players, cars, anthropomorphic kinematic control systems for games [0] need the equivalent of multiple human lifetimes to do anything interesting.
That they can end up skilled in so many fields it would take humans many lifetimes to master is notable, but it's still kinda odd we can't get to the level of a 5-year-old with just the experiences we would expect a 5-year-old to have.
[0] Stuff like this: https://youtu.be/nAMSfmHuMOQ
Modern artificial neural networks are nowhere near the scale of the brain. The closest biological equivalent to an artificial neuron's weight is a synapse, and we have a whole lot more of them.
Humans do not start "learning" from zero. Millions of years of evolution play a crucial role in our general abilities. Much more equivalent to fine-tuning than starting from scratch.
There's also a whole lot of data from multiple senses that currently dwarfs anything modern models are trained on.
LLMs need a lot less data to speak coherently when you aren't trying to get them to learn the total sum of human knowledge.
https://arxiv.org/abs/2305.07759
>but it's still kinda odd we can't get to the level of a 5-year-old with just the experiences we would expect a 5-year-old to have
Well we're not building humans.
"It's still kind of odd we can't get a plane or drone to fly with the energy consumption or efficiency proportions of a bird".
I mean, sure, I guess, and it's an interesting discussion, but the plane is still flying.
But it's still a definition that humans pass and the AI don't.
(I'm in favour of the "do submarines swim" analogy for intelligence, which says that this difference isn't actually important).
Hell, plenty of normal people would fail your "test".
In our terms, intelligence is (importantly) the ability to (properly) refine a world model: if you get information but said model remains unchanged, then intelligence is faulty.
> humans
There is a difference between the implementation of intelligence and the emulation of humans (which do not always use the faculty, and may use its opposite).
Also https://news.ycombinator.com/item?id=37054241 has quite a few examples of GPT-4 being broken.
Famously, the way R/L sound the same to many Asians (and equivalently but less famously, the way that "four" and "stone" and "lion" when translated into Chinese sound almost indistinguishable to native English speakers).
But there's also plenty of people who act like they think "Democrat" is a synonym for "Communist", or that "Wicca" and "atheism" are both synonyms for "devil worship".
What makes the AI different here is that we can perfectly inspect the inside of their (frozen and unchanging) minds, which we can't do with humans (even if we literally freeze them, we don't know how).
We don't lose our marbles the way GPT does when it encounters those words. It's like it read the Necronomicon or something and went mad.
Kinda, but not really...
It depends on exactly what you mean by it. So yes, we can look at one thing in particular, but there is not enough entropy in the universe to look at everything for even a single large AI model.
https://platform.openai.com/playground/p/bWvklOt98oEl0TzxUKW...
Evolution alone means humans are "cheating" in this exam, making any comparisons fairly meaningless.
That's both why I'm fine with the AI "cheating" by the transistors being faster than my synapses by the same magnitude that my legs are faster than continental drift (no really I checked) and also why I'm fine with humans "cheating" with evolutionary history and a much more complex brain (around a few thousand times GPT-3, which… is kinda wild, given what it implies about the potential for even rodent brains given enough experience and the right (potentially evolved) structures).
When the topic is qualia — either in the context "can the AI suffer?" or the context "are mind uploads a continuation of experience?" — then I care about the inner workings; but for economic transformation and alignment risks, I care if the magic pile of linear algebra is cost-efficient at solving problems (including the problem "how do I draw a photorealistic werewolf in a tuxedo riding a motorbike past the pyramids"), nothing else.
I'm sorry to tell you this, but there are many humans who would fail your test. Even otherwise healthy humans could fail it, never mind patients with anterograde amnesia, dementia, etc.
I'm fairly certain you're incorrect.
As we have no way to find them systematically, we can't tell if we all do, or if it's just some of us.