OpenAI's plans according to sama (humanloop.com)
Most probably this is driven by their use of it in ChatGPT, which is on fire from PMF. Clearly they're experimenting with the cheaper GPT-4 in ChatGPT right now as it's fairly turbo now, as discussed earlier today.
Just when I went back to the post for some quote material...
How many shell corporations are intelligence agencies seeding right now?
It's pretty hilarious and annoying to see Bing start to write code only to censor itself after a few lines (deleting what was there! no wonder these guys love websockets and dynamic histories)
Whoops!
The conspiracy theory isn't that every employee of OpenAI spent 8 hours every day for six months in meetings with govt agencies.
Please make more effort next time than to provide me with a Wiki article.
I tried it again tonight and it seems like they fixed it to only produce small amounts of mediocre code instead.
(Just to be clear, this is a hypothetical intelligence agent saying this, not me.)
I mean, it's not exactly rocket science, who wouldn't instantly fold to that?
(I mean, since we’re just making up wild hypotheticals)
Am I reading this right? "We're not open sourcing GPT-3 because we don't think it would be useful to anyone else"
>> 4. OpenAI will avoid competing with their customers — other than with ChatGPT
On this I would not bet a dime.
There is plenty to criticise OpenAI for but what he and they have achieved is extraordinary, and there is no need for that sort of toxic personal attack.
Almost all of these companies have the technical ability, desire, and means to self-host for their employee community. Imagine the internal coup for CTO/CIOs everywhere to buy whatever is the latest Nvidia GPU cluster box, stick it in the on-prem datacenter, load a licensed GPT model and provide "AI as a service to our employees".
Except what's happening is everybody is looking at buying the box from Nvidia, and sticking a large actually open model on it and simply ignoring OpenAI.
Well, one of the companies I worked for could have hosted a canary service for cron jobs. But we bought it instead of building because we were focused on building features. And here you’re talking about hosting an entire LLM.
Also OpenAI: Meta is pissing in our moat, let's drop a hint about open sourcing our shit too!
I think this is reasonable. Giving researchers access is great but for most small companies they're likely better off having a service provider manage inference for them rather than navigate the infra challenge.
“It’s too hard, trust us” doesn’t really make sense in that context. If it is indeed too hard for small orgs to self host then they won’t. Hiding behind the guise of protecting these people by not open sourcing it seems a bit disingenuous.
"I'm not sharing my chocolate with you because you probably wouldn't like it"
That said, I can imagine a GPTQ/4-bit quantized model to be smaller and easier to run on somewhat commodity clusters?
Or it could run with GGML/llama.cpp on a cloud instance with a TB of RAM?
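For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. It assumes the 175B parameter count published for GPT-3 and ignores activations, the KV cache, and the per-group metadata that real quantization schemes such as GPTQ add, so treat the numbers as a lower bound rather than a sizing guide. The function name is made up for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GiB.

    Ignores activations, KV cache, and quantization metadata (assumption).
    """
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 175e9  # assumption: GPT-3's published parameter count
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_gib(n_params, bits):,.0f} GiB")
```

Going from 16-bit to 4-bit cuts the weight footprint by a factor of four, which is what moves a model of this size from a multi-node GPU cluster toward a single large-memory machine.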
After seeing what people were able to do with LLaMA, I am positive that the community will find a way to run it - albeit with some loss in performance.
It would be truly amazing if they used their computing to develop quantized models as well.
That’s not ideal
How does open source licensing work with respect to trained ai models anyway? Is something like the mit license even that valuable here? Or is it?
Am I somehow being protected by a benevolent sama not open-sourcing the model?
The great thing about open source is that people can try different approaches and gravitate towards what works best for them. Sam knows that of course, he's just being disingenuous because the truth makes him look bad.
We don't provide nuclear weapons for everyone to keep in their basement, why would someone who believes AI is an existential risk provide their code?
this certainly aligns with the massive (albeit subjective and anecdotal) degradation in quality i've experienced with ChatGPT GPT-4 over the past few weeks.
hopefully a superior (higher quality) alternative surfaces before it's unusable. i'm not considering continuing my subscription at this rate.
Which is all the more curious, considering OpenAI said this only in January:
> Azure will remain the exclusive cloud provider for all OpenAI workloads across our research, API and products [1]
So... OpenAI is severely GPU constrained, it is hampering their ability to execute, onboard customers to existing products and launch products. Yet they signed an agreement not to just go rent a bunch of GPUs from AWS???
Did someone screw up by not putting a clause in that contract saying "exclusive cloud provider, unless you cannot fulfil our requests"?
[1]: https://openai.com/blog/openai-and-microsoft-extend-partners...
If you understand the shape of the power law scaling curves, shouldn't this scaling hypothesis tell you that AGI is not close, at least via a path of simply scaling up GPT-4? For example, the GPT-4 paper reports a 67% pass-rate on the HumanEval benchmark. In Figure 2, they show a power-law improvement on a medium-difficulty subset as a function of total compute. How many powers of ten are we going to increase GPT-4 compute by just to be able to solve some relatively simple programming problems?
Edited: Don't know if it is a good thing to study the weak points of closed LLMs. Even asking LLMs can give hints about possible ways to improve. In my case I am happy I am certainly old and my mind is a lot weaker than before, but even in this case I prefer not to use LLMs for gaining insight, because they will someday have better insight than I do. But the lust for knowledge is a mortal sin.
100x GPT-4 to 85%.
(quoting from the GPT-4 paper):
>All but the 15 hardest HumanEval problems were split into 6 difficulty buckets based on the performance of smaller models. The results on the 3rd easiest bucket are shown in Figure 2
You'll know the problem is solved when model after model consistently uses a method. Until then (and especially if you're not in the field as a researcher), assume that every paper claiming to tackle context length is simply a nice proposal.
Can anyone elaborate on this? This is a big issue for me.
Technically he can claim that OpenAI will not release competing products while Microsoft plugs AI into everything.
Microsoft just announced at Build 2023 that they'll have OpenAI tech integrated with: Windows, Bing, Outlook, Word, Teams, Visual Studio, Visual Studio Code, Microsoft Fabric, Dynamics, GitHub, Azure DevOps, and Logic Apps. I probably missed a bunch.
Very soon now, everything Microsoft sells will have OpenAI integration.
Unless you're selling a niche product too small for Microsoft to bother with, you're competing directly against OpenAI.
Oh, and to top it off: Microsoft can use GPT 4 all they want, via API access. Third parties have to beg and plead to get rate-limited access. That access can be withdrawn at any time if you're doing something unsafe to OpenAI's profit margins.
"Please Sir Sam, may I have some GPT please?"
"No."
Haha having just finished the Wheel of Time, I'm super tickled by this reference.
It doesn't seem to be too common, only two uses of it on HN in the past year (at least, found by searching for the phrase "Aes Sedai")
The frontier is the multimodal version of GPT-4, which he just said won't even get a public release until next year. Or whatever they are on now which they are carefully not calling GPT-5.
It sounds a little too much sci-fi for me, but I guess he knows better.
There’s been quite a bit happening in the programming space since sept 2021.
I use GPT to keep things high level and then do my normal research methodology for implementation details.
My understanding is right now they essentially need to train a new model on a new updated corpus to fix this, but maybe some other techniques could be devised...or they'll train something more up to date.
For smaller projects that will fit, I've taken to: `xclip *` and then pasting the entire collection of files into ChatGPT before describing what I want to do.
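If you want the model to know which text came from which file, a small script can do the same job with a header per file. This is a minimal sketch, not what the commenter actually runs: the function name and the `### filename` header format are my own choices, and it assumes the files are UTF-8 text and that the total fits in the context window.

```python
from pathlib import Path


def bundle_files(directory: str, pattern: str = "*") -> str:
    """Concatenate the text files in `directory` into one paste-ready string.

    Each file is preceded by a '### <name>' header (an arbitrary convention).
    """
    parts = []
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary or non-UTF-8 files
        parts.append(f"### {path.name}\n{text}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Print the bundle so it can be piped into a clipboard tool.
    print(bundle_files("."))
```

Running it as `python bundle.py | xclip -selection clipboard` (with the script saved under that hypothetical name) puts the result on the clipboard.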
>The scaling hypothesis is the idea that we may have most of the pieces in place needed to build AGI and that most of the remaining work will be taking existing methods and scaling them up to larger models and bigger datasets. If the era of scaling was over then we should probably expect AGI to be much further away. The fact the scaling laws continue to hold is strongly suggestive of shorter timelines.
There will be no 'AI model' that is 'AGI', rather, a large swath of different technologies and models, operating together, will give the appearance of 'AGI' via some kind of interface.
It will not appear as an 'automaton' (aka single processing unit) and it certainly will not be an 'aha moment'.
In 10 years, you'll be able to ask various agents, of different kinds, which will use varying kinds of AI to interpret speech and to infer context, and which will interface with various AI APIs. In many ways it'll resemble what we have today but with more nuance.
The net appearance will evolve over time to appear a bit like 'AGI' but there won't be an 'entity' to identify as 'it'.
If this were true the debate would be a hell of a lot easier. Unfortunately, it is not.
For what it's worth I've found the model actually performs significantly worse at most tasks when given access to browsing, in part because it relies on that instead of its own built-in knowledge.
I haven't found a good way to have it only access the web for specific parts of its response.
The page now says "This content has been removed at the request of OpenAI." I wonder why they did it.
It's fine to advocate for a redefinition but be explicit about it.
The suspicion[0] is that OpenAI trained their models on a large text dump including libgen (in the so-called "books2").
If a person downloads a book from Library Genesis, they're a pirate; if OpenAI does it, so are they.
[0] https://twitter.com/theshawwn/status/1320282152689336320
A bit sad to hear that the multimodal model will only come next year, was hoping to get it this year
100k to 1 million context length sounds phenomenal, especially if it comes to GPT-4. I've used Claude's 100k context length and I found it so useful that when I have large documents I just default to Claude now
I think getting access to Claude through slack is much easier and I recently got it by just downloading it as a Slack App
Will they use copilot(s) to improve the models? Yes, but they have been doing that since 2021 already (the release year of GitHub Copilot).
>tell AI to make itself more efficient by finding performance improvements in human written code
>that newly available processing power can now be used to find more ways to improve itself
>flywheel effect of AI improving itself as it gets smarter and smarter
eventually you'd turn it loose on improving the actual hardware it runs on. I think the question now is really how far transformers can be taken and if they are really the path to "real" AI.
Also don't confuse all other types of human/animal characteristics like sentience with intelligence. They are different things. Things like sentience, subjective stream of experience, or other aspects of being alive don't just accidentally fall out of larger training datasets.
And we should be glad. The models are going to be orders of magnitude faster (and perhaps X times higher IQ) than humans within a few years. It is incredibly foolish to try to make something like that into a living creature (or emulation of living).
This would be huge for many applications, as "chatting" with GPT-4 gets really, really expensive very quickly. I've played with API with friends, and winced as I watched my usage hit several dollars for just a bit of fun.
Probability mass functions? Anyone know what this means in this context?
Dozens of people using it daily for coding and conversations and review in a month might be a couple hundred bucks. All day convo, constantly, as fast as it can respond, might add up to $5.
Not sure what kind of convo you're having that you could hit $10 unless you're parallelizing with something like the "guidance" tool or langchain.
And yes, parallelism and loops are also key enablers for advanced use-cases.
For example, I have a lot of legacy code that needs uplifting. I'd love to be able to run different prompts over reams of code in parallel, iterating the prompts, etc...
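As a rough illustration of that workflow, here is a sketch that fans several prompt templates out over several code chunks in parallel. `call_model` is a placeholder you would supply yourself (for example, a wrapper around whatever API client you use); nothing here is tied to a specific provider, and real use would also need rate-limit handling and retries.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def run_prompts(
    templates: List[str],
    chunks: List[str],
    call_model: Callable[[str], str],  # hypothetical: sends one prompt, returns the reply
    max_workers: int = 8,
) -> Dict[Tuple[str, str], str]:
    """Run every (template, chunk) pair through `call_model` concurrently."""
    jobs = [(t, c) for t in templates for c in chunks]

    def run(job: Tuple[str, str]) -> str:
        template, chunk = job
        # Plain replace instead of str.format, so braces in the code are left alone.
        return call_model(template.replace("{code}", chunk))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(run, jobs))  # map preserves job order
    return dict(zip(jobs, outputs))


if __name__ == "__main__":
    fake_model = lambda prompt: f"[reply to {len(prompt)} chars]"  # stand-in for a real call
    results = run_prompts(
        ["Modernize this code:\n{code}", "List the bugs in:\n{code}"],
        ["int main() { return 0; }", "def f(x): return x"],
        fake_model,
    )
    for (template, chunk), reply in results.items():
        print(template.splitlines()[0], "|", chunk, "->", reply)
```

Threads are enough here because the work is I/O-bound: each worker spends its time waiting on the model, so iterating on the templates just means re-running with a new list.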
The point of these things is that they're like humans you can clone at will.
The ability to point thousands of these things at a code base could be mindblowing.
I know a lot of LLM stuff has either been released or leaked out, but don't have enough expertise in this area to understand the competitive advantages or breakthroughs OpenAI has obtained.
Instruction tuned LLaMA 65B/Falcon 40B are good, especially with an embeddings database.
...But OpenAI has all the name recognition and ease of use now, so it might not even matter if others ambiguously surpass OpenAI models.
I have had random runs of good days and bad days since starting to use ChatGPT.
They have likely been subsidizing their users since the launch of their commercial offering (and this is pretty common strategy for SV startups) but they've been so successful that they now need to scale the cost down in order not to burn all their cash too fast.
https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs ChatGPT? Inside Microsoft's AI supercomputer"
The relevance here is that Azure appears to be very well designed to handle the hardware failures that will inevitably happen during a training run taking weeks or months and using many thousands of GPUs... There's a lot more involved than just renting a bunch of Amazon GPUs, and anyways the partnership between OpenAI and Microsoft appears quite strategic, and can handle some build-out delays, especially if they are not Microsoft's fault.
Don't assume Microsoft is bad at everything and that AWS is automatically superior at all product categories...
> Did someone screw up by not putting a clause in that contract saying "exclusive cloud provider, unless you cannot fulfil our requests"?
Maybe MSFT refused to sign such an agreement?
All the cloud providers are building out this type of capacity right now. It's already having a big impact in terms of quarterly spend, which we just saw in the NVDA Q1 results. AWS, Azure, and GCP for sure, but also smaller players like Dell and HPE and even NVidia themselves are trying to get into this market. (Disclaimer: I work at one of these places but don't feel like saying which). I suspect the GPU constraints won't be around too long, at which point we'll find out if OpenAI made a contractual mistake.
I think that there aren't a lot of GPUs available and it takes time to add more to the datacenter even when you do get them.
that absolutely isn’t an attempt to slow down all competition.
which isn’t necessary because nobody made such a mistake.
this won’t lead to any hasty or reckless internal decisions in a feckless effort to stay in front.
not that any have already been made.
not that that could lead to disaster.
I know it's a joke, but the hole is that the god-AI couldn't have been that smart, since cryptocurrency mining quickly switched to ASICs, which muted the demand increase for GPUs.
The talk came across to me less as OpenAI's future plans and more as the CEO lamenting how severely the company is limited by GPUs, or the lack thereof. It's just a cheeky ploy by the CEO of an AI company currently at a 30B USD valuation to get more money in order to buy several [fill in the blank] of these most advanced GPU systems [2].
[1]Nakamoto's Neighbor: My Hunt For Bitcoin's Creator Led To A Paralyzed Crypto Genius:
https://www.forbes.com/sites/andygreenberg/2014/03/25/satosh...
[2]Nvidia DGX GH200: 100 Terabyte GPU Memory System:
The reason companies shun OpenAI and want a self-hosted alternative isn't related to costs, it's because they don't want their code, internal emails, documentation etc. to be uploaded to Microsoft and thus also directly to the NSA.
But they do upload them to Slack and MS Teams. Also to mail services and other places.
It’s possible to think squirt guns shouldn’t be regulated but AR-15s should, or AR-15s shouldn’t but cruise missiles should. Or driving at 25mph should be allowed but driving 125mph shouldn’t.
I'm reading the web serial Pact right now, where the main character can lie but it costs him dearly each time: https://pactwebserial.wordpress.com/
I also really enjoyed the ending of the Confederation Trilogy by Peter F Hamilton, which revolved around the inability of the Tyrathca race to tell lies.
PS: Just for laughs, I used to practice talking like an Aes Sedai, never telling an outright lie while actively deceiving people. It's an interesting skill to acquire and surprisingly easy. Once you learn how to do it, you'll never see a press conference or a political speech the same way ever again.
I also haven't read Confederation, but I have read a different work by Hamilton (The Commonwealth Saga). I actually thought it was a bit too long, but it's one of the series I think about most often (so many great ideas and characters in it).
They are 'next word prediction models' which elicit some kinds of reasoning embedded in our language, but it's a crude approximation at best.
The AGI metaphors are Ayahuasca Koolaid, like a magician duped by his own magic trick.
There will be no AGI, especially because there will be no 'automaton' aka distinct entity that elicits those behaviours.
Imagine if someone proposed 'Siri' were 'conscious' - well nobody would say that, because we know it's just a voice-based interface onto other things.
Well, Siri is about to appear much smarter thanks to LLMs, and be able to 'pass the bar exam' - but ultimately nothing has fundamentally changed.
Whereas each automaton in the human world had its own distinct 'context' - the AI world will not have that at all. Context will be as fleeting as memory in RAM, and it will be across various systems that we use daily.
It's just tech, that's it.
> elicit some kinds of reasoning
I know it's hard, but you have to choose here. Are they reasoning or are they not reasoning?
> next word prediction models
238478903 + 348934803809 = ?
Predict the next word. What process do you propose we use here? "Approximately" reason? That's one hell of a concept you conjured up there. Very interesting one. How does one "approximate" reason and what makes it so that the approximation will forever fail to arrive at its desired destination?
> Whereas each automaton in the human world had its own distinct 'context' - the AI world will not have that at all. Context will be as fleeting as memory in RAM, and it will be across various systems that we use daily.
Human context is fleeting as well. Time, dementia and ultimately death can attest to that. Even in life identity is complicated and multifaceted without a singular I. For all intents and purposes we too are composed of massive amounts of loosely linked subsystems vaguely resembling some sort of unity. I agree with you on that one. General intelligence IMO probably requires some form of cooperation between disparate systems.
But you see some sort of fundamental difference here between "biology" and "tech" that I just cannot. If RAM was implemented biologically, would it cease to be RAM? I fail to see what's so special about the biological substrate.
To be clear, I'm not saying LLMs are AGI, but I have a hard time dismissing the notion that some combination of systems - of which LLMs might be one - will result in something we just have to call generally intelligent. Biology just beat us to it, like it did with so many things.
> It's just tech, that's it.
The human version is: it's just biology, that's it. What's the purpose of stating that?
I disagree, language is all we need. Agency? Encode your “internal needs” as prompts, periodically generate prefixes from these, append them to incoming prompts. Self-awareness? Summarize this internal dialogue, reflect on it with a few iterations, add the results to the common prefix. Sentience? Attach some sensors, summarize their observations with the language model, prepend to prefix. Actions? Make the model output commands that some servos or other interfaces understand. Etc, etc.
And, of course, it would be extremely _cool_ to make something like that into a living creature, and lots of labs are already doing that. Fear and luddism should not stand in the way of curiosity.
If we humans cannot improve our own intelligence, making something smarter than us is an evolutionary imperative.
It is almost certain that the next stage of intelligence will be digital. But it is very foolish and unnecessary to try to speed that along.
It is likely that we have a century or two max left in control of the planet, regardless of what we do. On some level I agree that totally suppressing it indefinitely would be a shame.
When I said "living" I meant digital life. Such as those things you describe and others including control-seeking, self-preservation, and reproduction which are all central to living beings.
The problem is that AI will soon think 100 or more times faster than humans. This is anticipated based on the history of increases in computing efficiency and the fact that we are now optimizing a very specific system (LLMs). Humans will not in any way be able to keep up.
This is not luddism. I have a service that connects GPT-4 to Linux VMs on the internet to install or write software. I think this technology is great and has a lot of positive potential.
But when you deliberately try to emulate animals (like humans) and combine that with hyperspeed and other superintelligent characteristics, you are essentially approaching suicide or at least, abdicating all responsibility for your environment. There is no way to prevent such a thing from making all of your decisions for you.
The speed difference will be incredible. Imagine a bullet time scene where everyone seems to be moving in extreme slow motion. Now multiply by 10 so they are so slow they seem completely frozen.
This level of performance is coming in five years or less.
While I don't want to suppress the evolution of intelligent life in our corner of the universe, I also am not ready to join a death cult. Especially not accidentally.
https://www.technologyreview.com/2022/04/06/1048981/worldcoi...
and loopt
We just haven't figured out how yet.
I love it. It's dangerous. Hypocrisy? No. It's the tendency of man to pursue what is harmful to himself. Even though optimistic technologists of HN won't agree, much of technology has not benefitted humanity.
Also he's already half way to being a billionaire anyway without OpenAI.
Did the GPU manufacturers ever embrace cryptocurrency? IIRC, they actually tried to discourage it (e.g. by putting throttling into mass market models to discourage their use for computation).
Also, the graphs here show a long-term downward trend, with only a short-term sales blip 5 years ago due to cryptocurrency: https://www.tomshardware.com/news/sales-of-desktop-graphics-....
Just to name a few families of approaches: Sparse Attention, Hierarchical Attention, Global-Local Attention, Sliding Window Attention, Locality-Sensitive Hashing Attention, State Space Models, EMA Gated Attention.
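To make one of these concrete, here is a toy sketch of the mask behind sliding window attention. It assumes the causal variant, where each position attends only to itself and the previous `window - 1` tokens; the function names are illustrative and real implementations never materialize the full mask like this.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Causal variant (assumption): position i sees positions i-window+1 .. i.
    """
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the query/key pairs that actually get computed."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    n, w = 8, 3
    for row in sliding_window_mask(n, w):
        print("".join("x" if allowed else "." for allowed in row))
    full_causal = n * (n + 1) // 2
    print(f"pairs with window {w}: {attended_pairs(sliding_window_mask(n, w))} vs full causal: {full_causal}")
```

The point is the scaling: the number of attended pairs grows roughly as `seq_len * window` instead of `seq_len ** 2`, at the price of each layer seeing only local context.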
Notably, human working memory isn't great either. Which begs the question (if the comparison is valid) as to whether that limitation might be fundamental.
It's $20 a month and comes with 300 GPT-4 messages and 1000 Claude 1.2 messages.
By comparison, ChatGPT Plus gives you up to 6000 GPT-4 messages a month for the same price (admittedly it would be hard to use that many as they are given in 3 hour blocks).
In a nutshell, part of your LLM prompt (usually your most recent question?) gets fed as a query for the embedding/vector database. It retrieves the most "similar" entries to your question (which is what an embedding database does), and that information is pasted into the context of the LLM. It's kinda like pasting the first entry from a local Google search into the beginning of your question as "background."
Some implementations insert your old conversations (that are too big to fit into the llm's context window) into the database as they are pushed out.
This is what I have seen, anyway. Maybe some other implementations do things better.
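The flow described above can be sketched in a few lines. To keep it self-contained, the "embedding" here is just a bag-of-words count compared by cosine similarity, which is an assumption for illustration; real setups use a learned embedding model and a proper vector index. All names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self) -> None:
        self.entries: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.entries.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list[str]:
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(store: ToyVectorStore, question: str, k: int = 1) -> str:
    """Paste the retrieved entries in front of the question as background."""
    background = "\n".join(store.search(question, k))
    return f"Background:\n{background}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add("The cron canary service pings every five minutes.")
    store.add("Llamas are South American camelids.")
    print(build_prompt(store, "What are llamas?"))
```

Pushing old conversation turns into the store as they fall out of the context window, as mentioned above, would just be more `add` calls.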
How is it embedded? Using a separate embedding model, like BERT or something? Or do you use the LLM itself somehow? Also, how do you create content for the vector database keys themselves? Also just some arbitrary off the shelf embedding? Or do you train it as part of training the LLM?
You do not need a fancy cloud hosted service to use an embeddings database like you do not need one to use a regular database (although you could).
Check https://github.com/kagisearch/vectordb for a simple implementation of a vector search database that uses local, on-premise open source tools and lets you use an embeddings database in 3 lines of code.