DeepMind’s new AI with a memory outperforms algorithms 25 times its size (singularityhub.com)
Exciting times.
If you don't update your database and indices, they are great. But updating them is really tempting when you do machine learning (especially if you know that people with deeper pockets will do so).
Typically you will have a neural network, you run it on your dataset, it produces a new dataset of embeddings, you index them, and you use this index to train a new neural network, and you repeat the loop, hopefully improving results along the way.
An NVMe SSD can write at 6 GB/s but is only rated for ~800 TB of total writes; that's about 37 hours of lifetime at max speed.
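For anyone who wants to check the arithmetic, here's a quick sketch. It assumes decimal units (1 TB = 1000 GB) and takes the 6 GB/s and 800 TB figures straight from the comment; the function name is just mine.

```python
# Back-of-the-envelope check of the endurance figure quoted above.
WRITE_SPEED_GB_PER_S = 6   # sustained write speed, from the comment
ENDURANCE_TB = 800         # rated total writes, from the comment


def hours_until_worn_out(endurance_tb: float, speed_gb_per_s: float) -> float:
    """Hours of continuous writing before the rated endurance is used up."""
    seconds = endurance_tb * 1000 / speed_gb_per_s  # decimal units: 1 TB = 1000 GB
    return seconds / 3600


hours = hours_until_worn_out(ENDURANCE_TB, WRITE_SPEED_GB_PER_S)
print(f"{hours:.1f} hours")
```

Real drives won't sustain peak write speed for that long, so treat this as a lower bound on lifetime rather than a prediction.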
"Only" 825 GB actually: https://pile.eleuther.ai/
A not-insignificant fraction of that is definitely copyrighted material, though, which raises some interesting questions when switching to a model of distributing "a smaller trained model plus the original raw training data" (though it seems that the team behind GPT-J are clearly happy to distribute their full set of data anyway, and seem to be enough under the radar to not attract the wrong sort of attention, at least for now).
48GB VRAM? 48+ gigabytes of system ram is cheap, 48 gigabytes of ram on a GPU is still painfully expensive.
Still superb, though, there's no reason you can't use other gofai tools vs a static database, to trigger expert systems or formalized reasoning.
Like, could we assume that for humans it's also faster to search for information on Wikipedia or would it be faster to recall from memory of already read Wikipedia? Although with humans stored information decay is present. (In a way a human form of garbage collection :P).
(This is a different blogpost, but does not seem to add over the original)
Edit: following derac's comment see https://news.ycombinator.com/item?id=29646112 for RETRO
[0] https://deepmind.com/research/publications/2021/improving-la...
GitHub Copilot is definitely GPT-3-based and is seeing real-world use https://copilot.github.com
Transformers are state of the art for many tasks so they are likely to be used for "intelligent" processing of text or speech data, but due to practical limitations you are probably interacting with them mostly through web services.
If anything, it's being used in force for social media marketing, where you're trying to say "buy this thing" in different ways every day.
Written Scots In the written mode, Scots spelling remains variable. Attempts to make it more consistent, notably the Scots Style Sheet produced by the Makars’ Club in 1947 or the Recommendations for Writers in Scots published by the Scots Language Society in 1985, have had at best only limited success, competing with other systems that have been developed to represent more closely localized varieties of spoken Scots.
When your reference text says the language isn't yet well captured in a single print, you better believe the wiki page is a hot mess.
Instead of training GPT-3 with 175B weights, you train a 25x smaller model and allow it to retrieve useful snippets from a large text index as additional information.
This solves the problem of very large models and the problem of updating an already trained model, as you can swap the text corpus with a newer one. The model learns mostly syntax, burning less trivia in its weights than a regular LM as it can simply copy the relevant information from the index.
This development was bound to happen as large LMs are expensive to use and it was an obvious idea. We've had these semantic search text indices for a few years already[1], they just weren't combined with text generation.
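To make the "retrieve useful snippets" idea concrete, here's a toy sketch. To be clear, this is not how the DeepMind model does it: I'm using plain word overlap instead of learned embeddings, and pasting the snippet into a prompt instead of feeding it into the network. The corpus and all function names are made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k corpus snippets most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, bag_of_words(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, corpus: list[str], k: int = 1) -> str:
    """Prepend retrieved snippets so a small model can copy facts from them."""
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical three-document "index"; swap it out to update what the model can look up.
corpus = [
    "The Pile is an 825 GB English text dataset.",
    "NVMe drives connect over PCIe.",
    "Scots spelling remains variable.",
]
print(build_prompt("How large is the Pile dataset?", corpus))
```

The point of the design is that the facts live in `corpus`, not in the weights, so replacing the corpus changes the answers without retraining anything.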
Surely this is a function of location? I understand the U.S.-English term “person of color” to be convoluted language for “not white”. One simple thing I notice is that if I search for, say, “child” on Google Image Search, the images indeed tend to look like what one would expect from the average inhabitant of an English-speaking nation; when I search “子供”, I indeed mostly see what I would expect from Japan. Similarly, if I search for “huis”, what I find tends to look like a house most likely situated in the Netherlands; with “บ้าน”, it does resemble more so stereotypical Thai architecture.
I would assume that a.i.'s made in, say, Japan would yield different results.
As humans age we apparently lose the former but compensate with the latter as best we can.
That is clearly not possible, so it can't be what they are doing.
Rather than diffusely encoding that knowledge in a massive number of self-organized layers of weights, it is explicitly encoded. The remaining network can "focus" on mapping input to retrieve the relevant information stored in that database, and extracting/interpolating/extrapolating that information based on the current context to generate useful output.
The idea that AI itself can be biased (as opposed to the dataset) also has some significant problems. The lead of Facebook AI Research got canceled on Twitter because he pointed out that it's the bias in the dataset used to train the AI that results in bias in the AI and not the AI itself that's biased. I'd also question whether Gebru is a "widely respected leader in AI ethics research". Model interpretability is not even close to a solved problem so just because you can demonstrate some correlation between images of black people and worse performance does not imply that "black person" is a causative factor. It could literally be dataset distribution or image contrast or any number of other plausible explanations that are easily fixable by an ML engineer.
Claiming that "the AI is not biased, the training set is" is like saying "this running program isn't buggy, it's just the source code that is buggy".
And normally, that's harmless - as you said, you'd expect to see an AI finding pictures of houses in the region/culture you are searching in. But in a multi-cultural/multi-ethnic society, searching for "people" and showing only what is considered the "majority" has a whole different set of ethical implications.
Identifying and ideally remediating such issues is why ethics research is so sorely needed.
I am not actually; I am searching for “huis”, not “Nederlands huis”; I'd expect the result I obtain from the former with the latter.
I'd actually expect “house” and “huis” to reveal similar results from a good search engine. Obviously this is not easily possible with how it is trained with corpora in a specific language, but from usability I think this is undesirable, if I specifically want Dutch houses I can always add that term as a specification; there is no way to simply search for houses, wherever they might be, in Dutch, or English, or Thai, or any other language.
That is to say, I'm not arguing that there is no problem; I'm arguing that the problem is highly dependent upon location, and that the article should not take such a U.S.A.-centric stance and act as though the rest of the world does not exist.
This is probably why there is more variance when searching for English terms as well, as a lingua franca. If I search “house” I do see some styles of architecture not commonly found in Anglo-Saxon nations, whereas all occurrences of “huis” do seem to be situated in the Netherlands.
I can only quote Joy Buolamwini on this:
“To fail on one in three, in a commercial system, on something that’s been reduced to a binary classification task, you have to ask, would that have been permitted if those failure rates were in a different subgroup?”
Come on, if you've worked at any large company using ML you know model performance is literally just taking the average accuracy/ROC/precision/etc over your training dataset plus some hold out sets. Then you track proxy metrics like engagement to see if your model actually works in production. At no point does race come into the equation. Naturally, if your choice of subgroup happens to not be a large proportion of either the dataset or the userbase then you don't see the poor performance on that subgroup show up in your metrics so you don't care to fix it.
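A tiny sketch of the effect described above: one aggregate number can look fine while a small subgroup does badly. The records are invented purely to show the arithmetic, and the group labels "A" and "B" are placeholders.

```python
from collections import defaultdict

# Hypothetical evaluation records: (subgroup, prediction_was_correct).
# Group "B" is only 5% of the data, so its failures barely move the average.
records = (
    [("A", True)] * 93 + [("A", False)] * 2
    + [("B", True)] * 3 + [("B", False)] * 2
)


def accuracy_by_group(rows):
    """Return (overall accuracy, {group: accuracy})."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, correct in rows:
        totals[group] += 1
        hits[group] += int(correct)
    overall = sum(hits.values()) / sum(totals.values())
    return overall, {g: hits[g] / totals[g] for g in totals}


overall, per_group = accuracy_by_group(records)
print(f"overall: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group}: {acc:.2f}")
```

Unless someone deliberately slices the metric like this, the 60% figure for the smaller group never shows up on a dashboard.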
For many tasks it wouldn't be helpful because the input is small enough to be covered by the context already, and for summarizing and question answering tasks, you want it to repeat information from other documents, but not from earlier in its own output.
It might be interesting for a long-context task like "given the first parts of this book, complete the next chapter".
Ethics education is (unfortunately) not really seen as necessary across the tech field, which is why ethics researchers need to be part of all stages of AI development.
And for what it's worth, ethics researchers should be part of all technology development - the "racist soap dispenser" should have been more than enough proof of how even a very simple, innocent product can contribute to ethnic discrimination.
I definitely cranked out the newspaper text much faster than I would have on my own, and the model actually made some really nice embellishments and added a couple ideas that I kept in the final text.
You can access GPT-3 directly now. There's no waitlist, but there still are restrictions. There's some examples here: https://beta.openai.com/examples
You don't even need the API. Once you get access, it comes with access to the playground, which is enough to do anything you like.
If you look at the examples, it's very "no code". You literally tell the AI what you're trying to do and it tries its best. Most of the work in prompt engineering is writing something that can't be misunderstood. But you just have to explain to it what you want like you would to a child.
GitHub Copilot uses Codex, a descendant of GPT-3.
Different regions, yes - but where did the training and benchmark datasets come from? AI research is surprisingly monocultural (or uses "standardized benchmarks", if you're feeling charitable). Not too long ago, there was a paper posted on HN that showed that a bunch of the datasets contain mislabeled data, which means a lot of "different" models are encoding similar biases.
For formatting, I copy from our spreadsheet instruction, and copy what the "markdown" format looks like.
Convert the following text to markdown format.
Text:
1. Open the app
2. Enter your USERNAME and PASSWORD
3. Go to the menu and click "Foobar"
4. Enter the following into the field [CORRECT INPUT HERE]
```
Markdown:
# How to Use Foobar
## Subtitle
- Open the app
- Enter your USERNAME and PASSWORD
- Select the menu and click <b>Foobar</b>
- Enter the following into the field <b><span style="color:green;">{correctInput}</span></b>
```
Text:
(your input here)
Yes, I know it's not really markdown. But you give it an example of input and output and it will figure it out as easily as a human can. The problem we faced was that it was meant to be markdown but it's not easy to write the parser, so we had a different format of "markdown" on different front ends. You'd have something a little different on Android, iOS, HTML.
But the beautiful thing is we can keep the same input and change the output to whatever the parser needs. And instead of having to write up regex that detects [CAPITAL LETTER INPUT] and converts that, the AI can just recognize it.
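For comparison, this is roughly what the hand-written regex route might look like. It's only a sketch: the pattern, the camelCase rule and the function names are my guesses at what such a converter would need, not anything we actually shipped.

```python
import re

# Matches placeholders written as [CAPITAL LETTERS AND SPACES].
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z ]*)\]")


def to_camel_case(words: str) -> str:
    """'CORRECT INPUT HERE' -> 'correctInputHere'."""
    first, *rest = words.lower().split()
    return first + "".join(w.capitalize() for w in rest)


def convert_placeholders(line: str) -> str:
    """Replace each [CAPITAL LETTER INPUT] with a styled {camelCase} placeholder."""
    def repl(match: re.Match) -> str:
        name = to_camel_case(match.group(1))
        return f'<b><span style="color:green;">{{{name}}}</span></b>'
    return PLACEHOLDER.sub(repl, line)


print(convert_placeholders("Enter the following into the field [CORRECT INPUT HERE]"))
```

Note that this mechanically produces `{correctInputHere}`, while the example in the prompt shortens it to `{correctInput}`. That kind of judgment call is exactly what a fixed rule misses and the model picks up from the example.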
First you have a continuum of language dialects, then one of them dominates for political reasons, then it gets codified, then enforced onto everybody through centralised education. Dialects not under direct unified political control become related but separate languages... And so on.
Other issues that are sure to arise are that the a.i. will have trouble with people who aren't smiling, that the data set probably contains people who look better than average, and that it almost certainly fails to include people who suffer from injuries or deformities in appropriate proportions.
Perhaps an interesting project is simply the compilation of a vast dataset of “world proportional pictures of people”. — It would be an interesting undertaking to realize such a dataset.
My personal take is you won't see any tangible movement on this until black women (or whatever group you choose) comprise a tangible proportion of revenue generating users. Corporations operate for money and nothing else.
Black women or other groups not viewed as the mainstream target for an AI solution aren't going to form a tangible proportion of revenue generating users if the software doesn't function properly for them. And a lot of the use cases for AI analysis don't involve the unrepresented-in-corpus minority group being the consumer anyway, they involve it being used to screen them by a third party who's been sold the tool on the false premise that it's free from human bias.
Yeah... I think I already agree with the person disagreeing with you just based on this description. "World class abacus" is not an accurate description of what a quantum computer is.
I’m also not a neuroscientist or a physicist though, so this is just my relative layman’s take.
0: https://www.the-scientist.com/infographics/infographic--quan...
Quantum Biology: The Hidden Nature of Nature - https://www.youtube.com/watch?v=ADiql3FG5is
An Introduction to Quantum Biology - with Philip Ball - https://www.youtube.com/watch?v=bLeEsYDlXJk
As an amateur, my instinct is that the mystery of the observer "causing" wave function collapse is the best clue we have either way.
I wouldn’t be surprised if we are just weighted neurons, or if we do something quantum too. I doubt the something quantum will be the same as what our quantum computers do.
Source https://www.physicsforums.com/threads/does-consciousness-cau...
I realize this isn't quite the answer you were looking for, but I thought it was worth mentioning.
I personally think this idea that there is some obvious categorical distinction between hard "physical" quantum phenomena and classically probabilistic ones is a fallacy. Quantum theory is in some sense just probability theory with more features.
More and more researchers are catching onto this and I think that's really exciting.
It would be like adding a classical computer we don't know how to turn on; it won't help anything if we don't know how to use it.
My gut feeling is that even if we use some probabilistic quantum compute, it should be transferable to normal compute. Also, animals and humans don't differ enough to assume we are special.
However, given the scale of known brain features, the temperature of the brain, and other sources of noise, it's very very very hard to imagine how the brain could be a quantum computer.
"Quantum" is turning out to have very little to do with how tiny or physical stuff is, and much more to do with assumptions about how observations emerge from interaction.
> rather than simply due to imperfect knowledge
That's the key right there... Quantum probability and quantum models don't rely on the existence of "perfect knowledge" or "underlying" objective states. States themselves are intrinsically probabilistic and contextually embedded. Measurements/observations cannot be cleanly decoupled from the states being measured, and are modelled as projections from a high dimensional space of possibilities onto some lower dimensional subspace.
This kinda thing works really well for social/cognitive systems which are incredibly sensitive to measurement process. For example, when conducting polls or surveys, the ordering of the questions is well known to impact the outcome. It turns out that this can be very well modelled using tools from quantum theory, and it has been.
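Here's a toy illustration of how an order effect falls out of the projection picture. This is my own minimal sketch, not taken from any particular paper: the respondent's state is a unit vector in a plane, answering "yes" to a question projects it onto that question's direction, and the angles are arbitrary values picked for illustration.

```python
import math


def unit(angle_deg: float) -> tuple[float, float]:
    """Unit vector in the plane at the given angle."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


def project(state, axis):
    """Project `state` onto the line spanned by the unit vector `axis`."""
    overlap = state[0] * axis[0] + state[1] * axis[1]
    return (overlap * axis[0], overlap * axis[1])


def prob_yes_then_yes(state, first, second) -> float:
    """Probability of answering 'yes' to `first` and then 'yes' to `second`."""
    after = project(project(state, first), second)
    return after[0] ** 2 + after[1] ** 2  # squared length of the projected state


belief = unit(0)        # hypothetical respondent's initial state
question_a = unit(30)   # 'yes' direction for question A (illustrative angle)
question_b = unit(75)   # 'yes' direction for question B (illustrative angle)

p_ab = prob_yes_then_yes(belief, question_a, question_b)
p_ba = prob_yes_then_yes(belief, question_b, question_a)
print(f"A then B: {p_ab:.3f}")
print(f"B then A: {p_ba:.3f}")
```

The two numbers differ because the projections don't commute. In ordinary probability the chance of "yes to both" can't depend on the order you ask in.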
Check out this book and all the books/papers citing it for a window into this fascinating world
And Von Neumann entropy is Shannon entropy, as applied to quantum states.
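Spelled out, with the standard definitions (here ρ is the density matrix of the state and the λ_i are its eigenvalues):

```latex
S(\rho) = -\operatorname{Tr}(\rho \ln \rho) = -\sum_i \lambda_i \ln \lambda_i,
\qquad
H(p) = -\sum_i p_i \ln p_i
```

Since ρ is positive semidefinite with trace 1, its eigenvalues form a probability distribution, so S(ρ) is just H evaluated on those eigenvalues.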
To be clear, any quantum computation can be simulated on a classical computer, but it takes exponentially many steps. This is proven mathematically already, with the single exception that it's not proven that a better classical algorithm couldn't remove the exponential difference.
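To get a feel for the exponential cost, here's a sketch of the memory needed by the most naive approach: storing the full state vector of 2^n complex amplitudes. I'm assuming 16 bytes per amplitude (two 64-bit floats). Cleverer simulation methods exist for special cases, so this only describes brute force.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory for a dense state vector: 2**n amplitudes at 16 bytes each by default."""
    return (2 ** n_qubits) * bytes_per_amplitude


for n in (10, 30, 50):
    print(f"{n} qubits: {statevector_bytes(n):,} bytes")
```

Every extra qubit doubles the figure, which is why brute-force simulation runs out of road somewhere around a few dozen qubits.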
There is a missing piece of the puzzle all right, it's the huge role the environment and body play in developing intelligence. Everyone's focusing on quantum effects or just the brain forgetting that all they learn comes from the experience of the environment on the body.
The forces that shape and restrict life are the same that guide our learning process and evolution. We're looking too close to the brain and missing the big picture. Embodiment is the thing we're glossing over.
Quantum intelligence or consciousness seems like a detour, a blind walk into mysticism unless someone can prove there are things in neurology and AI that only make sense from a quantum perspective.
I think there is some definite upside to knowing the ultimate complexity of a single neuron.
Is it a detour? I don’t think we can say for sure either way until the question of how a single neuron works is settled.
Yes, they are, in that neurons are made up of molecules which are held together and bind due to quantum effects. But that's the limit of it.
Could you elaborate on this? Are you referring to a specific research result or waxing poetic?
We can barely model the humble nematode C. elegans with its 302 neurons.
By the butterfly principle, sand walls on the beach do alter the world's tides.
What is not missing is metaphysical quantum woo[0]. We can, have, and do observe how neurons and synapses function; the issue is that the computational complexity with current approaches is great enough to make complete neurological modeling of a microscopic worm with under a thousand total cells difficult.
[0] and even if you want to take those effects into account in your electrochemical models, all they do is turn them into stochastic models. there is exactly zero evidence of that randomness being of any real value to the thing we care about, which is the emergent macro scale properties of these systems.