ChatGPT for Teams (openai.com)
Last one I remember was OpenAI GPT-4 API
Now normal software is priced to squeeze out as much money as possible: enterprises can afford more, so they are charged more. Individuals are highly price sensitive, so their plans have to be very cheap.
GenAI is quite different in that it's not zero marginal cost; the marginal costs are probably at least 50% of the price. So the price difference between enterprise and individual plans will be far smaller than usual, due to the common cost base.
WinRAR is 30 EUR per user when buying a single license, 9 EUR when buying 100 licenses.
Until I realized Perplexity will give you a decent amount of Mistral Medium for free through their partnership.
Who is sama kidding they’re still leading here? Mistral Medium destroys the 4.5 preview. And Perplexity wouldn’t be giving it away in any quantity if it had a cost structure like 4.5, Mistral hasn’t raised enough.
Speculation is risky but fuck it: Mistral is the new “RenTech of AI”, DPO and Alibi and sliding window and modern mixtures are well-understood so the money is in the lag between some new edge and TheBloke having it quantized for a Mac Mini or 4070 Super, and the enterprise didn’t love the weird structure, remembers how much fun it was to be over a barrel to MSFT, and can afford to dabble until it’s affordable and operable on-premise.
“Hate to see you go, love to watch you leave”.
- mixtral-8x7 or 8x7: Open source model by Mistral AI.
- Dolphin: An uncensored version of the mistral model
- 3.5-turbo: GPT-3.5 Turbo, the cheapest API from OpenAI
- 4-series preview OR "4.5 preview": GPT-4 Turbo, the most capable API from OpenAI
- mistral-medium: A new model by Mistral AI that they are only serving through their API. It's in private beta and there's a waiting list to access it.
- Perplexity: A new search engine that is challenging Google by applying LLM to search
- Sama: Sam Altman, CEO of OpenAI
- RenTech: Renaissance Technologies, a secretive hedge fund known for delivering impressive returns improving on the work of others
- DPO: Direct Preference Optimization. It is a technique that leverages AI feedback to optimize the performance of smaller, open-source models like Zephyr-7B.
- Alibi: a Python library that provides tools for machine learning model inspection and interpretation. It can be used to explain the predictions of any black-box model, including LLMs.
- Sliding window: a type of attention mechanism introduced by Mistral-7B. It is used to support longer sequences in LLMs.
- Modern mixtures: The process of using multiple models together, like "mixtral" is a mixture of several mistral models.
- TheBloke: Open source developer that is very quick at quantizing all new models that come out
- Quantize: Decreasing memory requirements of a new model by decreasing the precision of weights, typically with just minor performance degradation.
- 4070 Super: NVIDIA 4070 Super, new graphics card announced just a week ago
- MSFT: Microsoft
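To make the "Quantize" entry in the glossary above a bit more concrete, here is a minimal sketch of symmetric 8-bit quantization of a handful of weights in plain Python. It is an illustration only: the function names and the sample weights are made up for this example, and the formats used for real model files are block-wise and considerably more elaborate.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Rounding to the nearest code bounds the error by about half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight now needs one byte instead of two or four, at the cost of a rounding error of at most roughly `scale / 2` per weight, which is the "minor performance degradation" trade-off the glossary describes.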
I've set up my system to use several AI models: the open-source Mixtral-8x7, Dolphin (an uncensored version of Mixtral), GPT-3.5 Turbo (a cost-effective option from OpenAI), and the latest GPT-4 Turbo from OpenAI. I can easily compare their performances in Emacs. Lately, I've noticed that GPT-4 Turbo is starting to outperform Mixtral-8x7, which wasn't the case until recently. However, I'm still waiting for access to Mistral-Medium, a new, more exclusive AI model by Mistral AI.
I just found out that Perplexity, a new search engine competing with Google, is offering free access to Mistral Medium through their partnership. This makes me question Sam Altman, the CEO of OpenAI, and his claims about their technology. Mistral Medium seems superior to GPT-4 Turbo, and if it were expensive to run, Perplexity wouldn't be giving it away.
I'm guessing that Mistral AI could become the next Renaissance Technologies (a hedge fund known for its innovative strategies) of the AI world. Techniques like Direct Preference Optimization, which improves smaller models, along with other advancements like the Alibi Python library for understanding AI models, sliding windows for longer text sequences, and combining multiple models, are now well understood. The real opportunity lies in quickly adapting these new technologies before they become mainstream and affordable.
Big companies are cautious about adopting these new structures, remembering their dependence on Microsoft in the past. They're willing to experiment with AI until it becomes both affordable and easy to use in-house.
It's sad to see the old technology go, but exciting to see the new advancements take its place.
Love how deep the rabbit hole has gone in just a year. I am unfortunately in the camp of understanding the post without needing a glossary. I should go outside more :|
[1]: https://arxiv.org/abs/2108.12409
[2]: n.b. Ofir Press is co-creator of ALiBi https://twitter.com/OfirPress/status/1654538361447522305
(but seriously: Thanks !)
Also mixtral medium - no idea of what he means by that.
Not to mention a claim that mixtral is as good as gpt-4. It’s at the quality of gpt3.5 at best, which is still amazing for an open source model, but a year behind openai.
For a broad introduction to the field Karpathy's YouTube series is about as good as it gets.
If you've got a pretty solid grasp of attention architectures and want a lively overview of stuff that's gone from secret to a huge deal recently I like this treatment as a light but pretty detailed podcast-type format: https://arize.com/blog/mistral-ai
Hilariously, neither knows who sama is (Sam Altman, the Drama King of OpenAI), nor do they recognize when they themselves are being discussed.
Reading the responses in full also gives you a glimpse of the specific merits or weaknesses of these systems, namely how up to date their knowledge and lingo are, their explaining capabilities, and their ability to see through multiple layers of referencing. It also showcases whether the AIs are willing to venture a guess to piece together some possible interpretation for hoomans to think about.
Basically, he said he is happy with Mixtral 8x7B and thinks it is on par with or better than OpenAI's closed source model.
On what metrics? LMSys shows it does well but 4-Turbo is still leading the field by a wide margin.
I am using 8x-7b internally for a lot of things and Mistral-7b fine-tunes for other specific applications. They're both excellent. But neither can touch GPT-4-turbo (preview) for wide-ranging needs or the strongest reasoning requirements.
https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...
EDIT: Neither does mistral-medium, which I didn't discuss, but is in the leaderboard link.
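For readers wondering how a pairwise-vote leaderboard like the one linked above turns human preferences into a ranking, here is a rough sketch of an Elo-style rating update. Everything in it is an assumption made for illustration: the model names, the starting ratings, the K-factor, and the votes are invented, and the leaderboard's actual computation differs in its details.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update(ratings, winner, loser, k=32):
    """Shift rating points from loser to winner based on how surprising the win was."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical models and votes: each tuple is (winner, loser) of one blind comparison.
ratings = {"model-a": 1000.0, "model-b": 1000.0}
votes = [("model-a", "model-b"), ("model-a", "model-b"), ("model-b", "model-a")]
for winner, loser in votes:
    update(ratings, winner, loser)

print(ratings)
```

Because every update moves the same number of points from one model to the other, a model only climbs by winning comparisons that human voters actually judged in its favor.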
There's also very little if any credible literature on what constitutes statistically significant on MMLU or whatever. There's such a massive vested interest from so many parties (the YC ecosystem is invested in Sam, MSFT is invested in OpenAI, the US is invested in not-France, a bunch of academics are invested in GPT-is-borderline-AGI, Yud is either a Time Magazine cover author or a Harry Potter fanfic guy, etc.) in seeing GPT-4.5 at the top of those rankings and taking the bold one at < 10% lift as state of the art that I think everyone should just use a bunch of them and optimize per use case.
I have my own biases as well and freely admit that I love to see OpenAI stumble (no I didn't apply to work there, yes I know knuckleheads who go on about the fact they do).
And once you factor in "mixtral is aligned to the demands of the user and GPT balks at using profanity while happily taking sides on things Ilya has double-spoken on", even e.g. MMLU is nowhere near the whole picture.
It's easy and cheap to just try both these days, don't take my word for which one is better.
Goliath is too big for my system but Mixtral_34Bx2_MoE_60B[1] is giving me some really good results.
PSA to anyone that does not understand what we're talking about: I was new to all of this until two weeks ago as well. If you want to get up to speed with the incredible innovation and home-tinkering happening with LLMs, you have to check out https://www.reddit.com/r/LocalLLaMA/
I believe we should be at GPT4 levels of intelligence locally sometime later this year (Possibly with the release of Llama3 or Mistral Medium open-model).
[1] - https://huggingface.co/TheBloke/Mixtral_34Bx2_MoE_60B-GGUF
"My hips don't lie."
The trouble with the jargon is that it obfuscates to a high degree even by the standards of the software space, and in a field where the impact on people's daily lives is at the high end of the range, even by the standards of the software space.
HN routinely front-pages stuff where the math and CS involved is much less accessible, but for understandable reasons a somewhat tone-deaf comment like mine is disproportionately disruptive: people know this stuff matters to them either now or soon, and it's moving as quickly as anything does, and it's graduate-level material.
If you have concrete questions about what probably looks like word salad I'll do my best to clarify (without the aid of an LLM).
I might not know half of the references like "sama" or "TheBloke", but I could understand the context of them all. Like:
"the lag between some new edge and TheBloke having it quantized for a Mac Mini or 4070 Super,"
Not sure who TheBloke is, but he obviously means "between some new (cutting) edge AI model, and some person scaling it to run on smaller computers with less memory".
Similarly, not sure who Perplexity is, but "Until I realized Perplexity will give you a decent amount of Mistral Medium for free through their partnership" basically spells out that they're a service provider of some kind, that they have partnered with Mistral AI, and you get to use the Mistral Medium model through opening a free account on Perplexity.
I mean, duh!
Basically let an AI hallucinate on some technical subject. It would make a great script for a new encabulator video.
I'm curious because I'm gathering some use cases that I could share internally at the company to provide better education on what LLMs do and how they work.
It's a great tool they make available.
While I heavily rely on `emacs` as my primary interface to all this stuff, I'm slowly-but-surely working on a curated and opinionated collection of bindings and tools and themes and shit for all the major hacker tools (VSCode, `nvim`, even to a degree the JetBrains ecosystem). This is all broadly part of a project I'm calling `hyper-modern` which will be MIT if I get to a release candidate at all.
I have a `gRPC` service that wraps the outstanding work by the "`ggerganov` crew" loosely patterned on the sharded model-server architectures we used at FB/IG and mercilessly exploiting the really generous free-plan offered by the `buf.build` people (seriously, check out the `buf.build` people) in an effort to give hackers the best tools in a truly modern workflow.
It's also an opportunity to surface some of the outstanding models that seem to have sunk without a trace (top of mind would be Segment Anything out of Meta and StyleTTS which obsoletes a bunch of well-funded companies) in a curated collection of hacker-oriented capabilities that aren't clumsy bullshit like co-pilot.
Right now it's a name and a few thousand lines of code too rough to publish, but if I get it to a credible state the domain is `https://hyper-modern.ai` and the code will be MIT at `https://github.com/hyper-modern-ai/`.
Also, is anyone aware of a service that supplies API endpoints for dolphin? I'd love to experiment with it, but running locally exceeds my budget.
To my knowledge, and I searched to confirm, GPT-4.5 is not yet released. There were some rumors and a link to ChatGPT's answer about GPT-4.5 (could also be a hallucination) but Sam tweeted it was not true.
In all seriousness, are self hosted GPT alternatives really viable?
Do you have a source on Mistral/Mixtral using that?
It was just an example of a modern positional encoding. I regret that I implied inside knowledge about that level of detail. They're doing something clever on scalar pointwise positional encoding but as for what who knows.
I guess you can argue this is just a marginal add-on to their existing ChatGPT product but I can imagine seeing them go full Salesforce/Oracle/enterprise behemoth here.
I would say I'm very pro AI development and pro Sam reinstating but I've been starting to shake my head a bit. Their mission and their ambition are wildly different.
The mission changed when research ran into product market fit.
The sooner we build a tool to filter out garbage the better.
FTFY
People could work around it but it might help
We shouldn't assume a basic sentence capitalised word refers to a product.
If a reference to a product is intended, we should clarify that association some other way; i.e. MS Teams.
What's happening is that lowercase "sentence case" titles have become more popular and normalized so repeated exposure to that style can cause a subconscious heuristic of "Capitalized letter signifies a Brand Name or Proper Noun". You can try to advise people not to assume that but it doesn't change the type of "sentence case" titles people are now repeatedly exposed to.
The New York Times still uses "Title Case" but a lot of other newspapers switched to lowercase sentence case. Washington Post switched in 2009. And Los Angeles Times, The Boston Globe, the Chicago Tribune, the San Francisco Chronicle, Philadelphia Inquirer, etc all followed.
Other popular websites with lowercase titles include Vox, ArsTechnica, TechCrunch, etc.
Normally it would be ok to capitalize words to match what many other US publications use, but this capitalization introduces confusion. I can only speak for myself, but I made the assumption that this was an integration with MS Teams. This would have been avoided if the original title was kept.
I do agree that editors should read their topics critically and add disambiguating text (if possible and permitted).
We shouldn’t but many of us do. As a title word, there’s ambiguity if it’s a proper noun or not given title styling. Given the context in this case (HN, OpenAI, ChatGPT) it was pretty difficult for my brain to not assume it was referring to Microsoft Teams so it baited me in, perhaps unintentionally. I’m not too upset about it because I knew that going in but nonetheless, a quick read of the title should make it obvious to call it “ChatGPT for Collaboration” or something of that nature.
Claiming it is a title wouldn’t win the argument either, as it is not a rule that titles must have title casing. Both (title case vs first letter capital only) are valid typography of a title in English.
When a common word is a product / brand, how do you use that word in a title without bringing up associations with that product / brand?
Aren't the guidelines to submit the title as-is, no editorializing?
Are you also going to complain if someone releases a platform called "The"?
j/k but finding it pretty funny these days that more and more people are switching to lowercase, assuming it started from this @sama tweet: https://twitter.com/sama/status/1735123080564167048
On HN, copy/paste is usually a safe bet (with a few exceptions): https://news.ycombinator.com/newsguidelines.html
If you mean in general, Wikipedia sums it up pretty well: https://en.wikipedia.org/wiki/Title_case
I'm with the parent comment on this one: for this particular headline it makes sense to lowercase "teams", since the HN audience will tend to correlate the uppercase version with the incorrect meaning of the headline.
I screen-capped my take on this to prove* that I was actually wiring all this stuff up and plug my nascent passion/oss project, but it's really funny comparing them either way: https://imgur.com/WDrqxsz
It's "unofficial" but "literally made it up" seems a bit unfair, it's not like I called it `GPT-4-Ti Founders Edition` and tried to list it on eBay.
Anything else is just wasting time.
Unfortunately, I don’t think I’m the only one.
In more intuitive terms, your bog-standard transformer overdoes it in terms of considering all context equally in the final prediction, and we historically used rather blunt-force instruments like causally masking everything to zero.
These techniques are still heuristic and I imagine every serious shop has tweaks and tricks that go with their particular training setup, but the Rope shit in general is kind of a happy medium and exploits locality at a much cheaper place in the overall computation.
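For anyone who hasn't seen RoPE (rotary position embedding, the "Rope" mentioned above) written out, here is a minimal sketch in plain Python. It rotates consecutive pairs of a vector by position-dependent angles, so the query-key dot product ends up depending only on the distance between the two positions. The pairing convention, the base of 10000, and the sample vectors are assumptions for illustration; real implementations differ in layout and add the tweaks discussed in this thread.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[i], vec[i+1]) by an angle that grows with position."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, dim, 2):
        # Lower-indexed pairs rotate quickly, higher-indexed pairs rotate slowly.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical query and key vectors.
q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (4 tokens apart) at two very different absolute positions.
score_near_start = dot(rope(q, 7), rope(k, 3))
score_far_along = dot(rope(q, 107), rope(k, 103))
print(score_near_start, score_far_along)
```

The two scores agree up to floating-point error, which is the relative-position property that makes this family of encodings a convenient place to apply context-extension tricks.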
There are quite a few attention extension techniques recently published:
* Activation Beacons - up to 100X context length extension in as little as 72 A800 hours https://huggingface.co/papers/2401.03462
* Self-Extend - a no-training RoPE modification that can give "free" context extension with 100% passkey retrieval (works w/ SWA as well) https://huggingface.co/papers/2401.01325
* DistAttention/DistKV-LLM - KV cache segmentation for 2-19X context length at runtime https://huggingface.co/papers/2401.02669
* YaRN - aforementioned efficient RoPE extension https://huggingface.co/papers/2309.00071
You could imagine combining a few of these together to basically "solve" the context issue while largely training for shorter context length.
There are of course some exciting new alternative architectures, notably Mamba https://huggingface.co/papers/2312.00752 and Megabyte https://huggingface.co/papers/2305.07185 that can efficiently process up to 1M tokens...
.. which seemed to fit surprisingly well.
- I know r/LocalLlama, huggingface's Daily Papers and TheBloke. Most of what Youtube throws at me is horrific clickbait. I feel like there are probably whole communities I'm missing out on.
Also note that HN automatically applies the capitalization rules to submitted headlines. So you don’t really need to do any changes to the capitalization yourself when you submit.
And for understanding how the rules work, I agree that Wikipedia article is good.
Local style guide says not to capitalise headlines, presumably.
Though you are right, the general rule here is not to change headlines and that should probably apply to style as well as wording. Looking at other posts on the main list there seems to be a mix as the original styles (sentence cap, all words capitalised, all but articles, …) have been kept.
Look no further than “Add Comment” at the top of this page.
Title case was invented when typographical options were much more limited than now, to emphasize, well, titles. On a web page, there are so many better ways to do it, there is no reason to preserve the archaic convention.
To my eye it does not even look that nice...
"ChatGPT for Microsoft Teams", marketed under the name Copilot, has been a Microsoft Teams feature for a while. What new thing did you expect when you read this?
Mixtral-8x7: This appears to be a technical term, possibly referring to a software, framework, or technology. Its exact nature is unclear without additional context.
Dolphin locally: "Dolphin" could refer to a software tool or framework. The term "locally" implies it is being run on a local machine or server rather than a remote or cloud-based environment.
3.5-turbo: This could be a version name or a type of technology. "Turbo" often implies enhanced or accelerated performance.
4-series preview: Likely a version or iteration of a software or technology that is still in a preview or beta stage, indicating it's not the final release.
Emacs: A popular text editor used often by programmers and developers. Known for its extensibility and customization.
Mistral Medium: This might be a product or service, possibly in the realm of technology or AI. The specific nature is not clear from the text alone.
Perplexity: Likely a company or service provider, possibly in the field of AI or technology. They seem to have a partnership offering involving Mistral Medium.
RenTech of AI: RenTech, or Renaissance Technologies, is a well-known quantitative hedge fund. The term here is used metaphorically to suggest a pioneering or leading position in the AI field.
DPO, Alibi, and sliding window: These are likely technical concepts or tools in the field being discussed. Without additional context, their exact meanings are unclear.
Modern mixtures: This could refer to modern algorithms, techniques, or technologies in the field of AI or data science.
TheBloke: This could be a reference to an individual, a role within a community, or a specific entity known for certain expertise or actions.
4070 Super: This seems like a model name, possibly of a computer hardware component like a GPU (Graphics Processing Unit).
MSFT: An abbreviation for Microsoft Corporation.
On-premise: Refers to software or services that are operated from the physical premises of the organization, as opposed to being hosted on the cloud.
That's a weirdly dismissive statement. The fundamental problem is that a lot of these terms are from after the AI's cutoff point. It's perfectly able to handle terms like "Emacs", "RenTech" or "MSFT", and it can guess that "4070 Series" probably refers to a GPU.
ChatGPT in a few years will probably be perfectly able to produce the correct answers.
(Actually, ChatGPT consistently claims its current cutoff is April 2023, which should let it give a better answer, so I'm taking a few points off my explanation. But it still feels like the most probable one.)
"4070 Super": https://chat.openai.com/share/0aac7d90-de65-41d0-9567-8e56a0...
"Mixtral-8x7": https://chat.openai.com/share/8091ac61-d602-414c-bdce-41b49e...
Give me a hand?
am just here to vibe in my down time
lulz.
Your reply is adding more confusion.
The "ChatGPT Team" is the title of the webpage from a user comment: https://news.ycombinator.com/item?id=38942936
(The capitalized "Team" is part of a Branded Product Name.)
The "ChatGPT for teams" (lowercase 't') is the actual original title of the submitted webpage for the whole thread: https://openai.com/chatgpt/team
(The lowercase "teams" is a generic noun to describe the intended users.)
Sorry, I think you're the one adding confusion.
"ChatGPT for teams" is not the title of the submitted webpage. It's the heading of the article on that page, but the actual title, as shown in the tab strip and defined in the html header, is "ChatGPT Team":
view-source:https://openai.com/chatgpt/team
<meta property="og:title" content="ChatGPT Team">
Hacker News's code uses the html title element (which in this case is actually a meta element) to automatically title submissions.
Yes, I understand that, but 99% of the people in this thread talking about the "title" mean the article's headline, which has the word "for" in it ("ChatGPT for teams"), and not the HTML tags <title>ChatGPT Team</title> or <meta property="og:title">
In other words, people in this thread are not complaining about "ChatGPT Team" in the browser's tab title or confused by it. Instead, they're talking about the other title "ChatGPT for Teams" that was submitted and manually changed from lowercase 't' to uppercase 'T' and prominently visible at the top of this thread. That's the context of this meta discussion about confusing capitalization causing some readers to incorrectly assume the headline is about a new ChatGPT addon feature for MS Teams: (https://www.microsoft.com/en-us/microsoft-teams/group-chat-s...)
The singular "Team" in product name "ChatGPT Team" doesn't cause the same confusion because Microsoft doesn't have a branded product called "Team". That's why citing "<title>ChatGPT Team</title>" ... does not help clarify things and just adds more confusion to this thread.
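To make the distinction argued over above easier to see, here is a small sketch that pulls the three candidate "titles" (the `<title>` element, the `og:title` meta property, and the visible `<h1>` headline) out of a page using Python's standard `html.parser`. The sample markup is a hand-written stand-in based on the strings quoted in this thread, not a copy of the real page, and nothing here is meant to describe how HN's own code works.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the <title> text, the og:title meta content, and the first <h1>."""

    def __init__(self):
        super().__init__()
        self.found = {"title": None, "og:title": None, "h1": None}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in ("title", "h1") and self.found[tag] is None:
            self._current = tag
            self.found[tag] = ""
        elif tag == "meta" and attr_map.get("property") == "og:title":
            self.found["og:title"] = attr_map.get("content")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.found[self._current] += data


# Hypothetical stand-in for the submitted page.
sample = """<html><head>
<title>ChatGPT Team</title>
<meta property="og:title" content="ChatGPT Team">
</head><body><h1>ChatGPT for teams</h1></body></html>"""

parser = TitleExtractor()
parser.feed(sample)
print(parser.found)
```

On this stand-in, the metadata says "ChatGPT Team" while the headline a reader sees says "ChatGPT for teams", which is why the two sides of the argument are each quoting a real title.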
>Hacker News's code uses the html title element (which in this case is actually a meta element) to automatically title submissions.
Are you sure? In the past, HN commenters have been annoyed that the HN software does not extract and parse the HTML <title> element automatically and defers too much to the titles that submitters manually type in which often leads to editorialization and/or confusion.
When I read the headline, I was not thinking of MS Teams. Because I only use MS Teams a few rare times a year. Mainly I use Zoom.
But even if OpenAI had an article that with the rules of headline capitalization of HN ended up reading “Dall-E adds Zoom Feature” I would have imagined that it was about being able to zoom into pictures. Not automatically assumed that it had anything to do with Zoom. Even though I use Zoom almost every day.
Quite the contrary, Microsoft earns free advertisement from this. The problem is for the people that misunderstand the heading, not for Microsoft
Ok, you would understand it correctly even the Zoom example. So?
The heading should be optimized for the largest possible number of people. There is a clear possibility that a lot of people will misunderstand the title, so why not simply edit it? In that case neither you nor anyone else would misunderstand it.
I think people are vastly over estimating Microsoft’s ownership over ‘Teams’. Even within a tech context.
Plus, everything we’ve seen from Microsoft in their partnership with OpenAI has been Co-Pilot. Which is why I use MS Teams daily and did not make the connection.
And the above is substantially what I said, and undoubtedly would find a better reception with a larger audience.
I'm troubled though, because I already sanitize what I write and say by passing it through a GPT-style "alignment" filter in almost every interaction precisely because I know my authentic self is brash/abrasive/neuro-atypical/etc. and it's more advantageous to talk like ChatGPT than to talk like Ben. Hacker News is one of a few places real or digital where I just talk like Ben.
Maybe I'm an outlier in how different I am and it'll just be me that is sad to start talking like GPT, and maybe the net change in society will just be a little drift towards brighter and more diplomatic.
But either way it's kind of a drag: either passing me and people like me through a filter is net positive, which would suck but I guess I'd get on board, or it actually edits out contrarian originality in toto, in which case the world goes all Huxley really fast.
Door #3 where we net people out on accomplishment and optics with a strong tilt towards accomplishment doesn't seem to be on the menu.
I was pleasantly surprised to find a glossary immediately following, which tells me it wasn't the tone of the post, but the shorthand terminology that was unfamiliar to me that was my issue.
I think writing in "Ben's voice" is great. There are just going to be times when your audience needs a bit more context around your terminology, that's all.
For example, "in which case the world goes all Huxley really fast." "Huxley" apparently means something to you. Would it mean anything at all to someone who hasn't read any Aldous Huxley? As someone who _has_, I still had to think about it -- a lot. I assumed you're referring to a work of his literature rather than something he actually believed, as Huxley's beliefs about the world certainly had a place for the contrarian and the original.
Further, I assume you are referring to his most well-known work, _Brave New World_, rather than (for example) _Island_, so you're not saying that people would be eating a lot of psychedelic mushrooms and living together in tolerant peace and love.
I don't at all think you need to sound like GPT to be a successful communicator, but you will be more successful the more you consider your audience and avoid constructions that they're unlikely to be able to understand without research.
The moment I change the way I talk and, instead of "That's bullshit, let's move away from it", say "That could be a challenging and rewarding experience", I can already see the advantage.
I rather like to talk the way I want, but I see it as challenging and not that rewarding as people seem to get more sensitive. That made me wonder if the way GPT-style chatbots communicate with humans would make humans expect the same way of communication from other humans.
What you are alluding to is quite similar to the “Instagram face” that everyone pursues and self-filters for, except it's more about your communication and thoughts. Also, I don't think reaching a wider audience is necessary unless you want that wider audience to comment and engage.
The internet is the great homogenizer; soon(ish) we will be uniform.
"mixtral medium" is just a typo: he means mistral-medium.
And GPT 4.5 is certainly not an "invention of his". Whether it exists or not (which is debatable, OpenAI said it was just mentioned in a GPT 4 hallucination and caught on), it's a version name thrown around for like a month in forums, blog posts, news articles and such.
Edit: I was using 2k window, a larger one would probably eat more ram. But even with 2k it didn’t feel like it loses context or something.
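As a back-of-the-envelope check on the "a larger one would probably eat more ram" remark above, here is a sketch of how the attention key/value cache grows with the context window. The layer count, number of KV heads, head size, and 16-bit storage are assumed defaults picked for illustration; they are not taken from the comment, and the estimate ignores the model weights themselves and any sliding-window or cache-quantization tricks.

```python
def kv_cache_bytes(context_tokens, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_value=2):
    """Estimate KV-cache size: one key and one value vector per layer, per KV head, per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return context_tokens * per_token


for ctx in (2048, 8192, 32768):
    mib = kv_cache_bytes(ctx) / 2**20
    print(f"{ctx:>6} tokens -> {mib:.0f} MiB of KV cache")
```

Under these assumptions the cache grows linearly with the window, so going from a 2k to a 32k context multiplies this part of the memory footprint by 16.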
As these models can be quite large and memory intensive, if you want to just give it a quick spin, huggingface.co/chat, chat.nbox.ai, and labs.pplx.ai all have Mixtral hosted atm.
I literally use 8x-7b on my on-prem GPU cluster and have several fine tunes of 7b (which I said in the previous post). I've used mistral-medium.
GPT-4-turbo is better than them all on all benchmarks, human preference, and anything that isn't biased vibes. My opinion - such that it is - is that GPT-4-turbo is by far the best.
I have no vested interest in it being the best. I'd actually prefer if it wasn't. But all objective data points to it being the best and most lived experiences that are unbiased agree (assuming broad model use and not hyperfocused fine-tunes; I have Mistral-7b fine-tunes beating 4-turbo in very limited domains, but that hardly counts).
The rest of your post I really have no idea what's going on, so good luck with all that I guess.
That's a use case.
Certainly, no one here is disputing that there are things openai refuses to allow, and given that the effectiveness of using GPT4 on them is literally zero, a sweet potato connected to a spring and keyboard will "beat" GPT-4, if that's your scoring metric.
If you want a meaningful comparison you need tasks that both tools are capable of doing, and then see how effective they are.
Claiming that mistral medium beats it is like me claiming the RenderMan beats DALLE2 at rendering 3d models; yes, technically they both generate images, but since it's not possible to use DALLE2 to render a 3d model, it's not really a meaningful comparison is it?
I don't have any use cases for crime in my life at the moment beyond wanting to pirate like Adobe Illustrator before signing up for an uncancelable subscription, but it will do arbitrary things within its abilities and it's google with a grudge in terms of how to do anything you ask. I stopped wanting to know when it convinced me it could explain how to stage a coup d'etat. I'm back on mixtral-8x7b.
Sorry but you're talking complete nonsense here. The benchmark by LMSys (chatbot arena) cannot be gamed, and Ravenwolf is a random-ass poster with no scientific rigor to his benchmarks.
The free version gets a lot of use around here but the most powerful feature is the ability to search the web, which is only available to paid users. I pay $20/month for myself and I’d happily pay a bit more for the whole family, but not $20/month per person - it adds up. Family members end up asking to borrow my phone a lot to use it.
Give me a 3-4 person plan that costs $30-$40/month. You’re leaving money on the table!
I showed the title to five colleagues and all of them assumed it referred to Teams the product. Not to mention the overwhelming majority of people in these threads.
I for one also thought it was Microsoft Teams in the heading.
If a heading misleads even a fraction of readership, for no apparent reason (what is the benefit of having a title in Title Case?), maybe it's better that the heading is changed, no?
Karpathy has a great YouTube series where he gets into the details from `numpy` on up, and George Hotz is live-coding the obliteration of PyTorch as the performance champion on the more implementation side as we speak.
Altman being kind of a dubious-seeming guy who pretty clearly doesn't regard the word "charity" the same way the dictionary does is more-or-less common knowledge, though not often mentioned by aspiring YC applicants for obvious reasons.
Mistral is a French AI company founded by former big hitters at e.g. DeepMind that brought the best of the best on 2023's public domain developments into one model in particular that shattered all expectations of both what was realistic with open-weights and what was possible without a Bond Villain posture. That model is "Mixtral", an 8-way mixture of experts model using a whole bag of tricks but key among them are:
- gated mixture of experts in attention models
- sliding window attention / context
- direct-preference optimization (probably the big one and probably the one OpenAI is struggling to keep up with, probably more institutionally than technically as probably a bunch of bigshots have a lot of skin in the InstructGPT/RLHF/PPO game)
It's common knowledge that GPT-4 and derivatives were mixture models but no one had done it blindingly well in an open way until recently.
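For anyone who wants the first two tricks in concrete form, here's a rough sketch in plain Python. It is not Mistral's code: the function names are made up for illustration, the gate shown is the generic "softmax over the top-k router logits" recipe, and the mask is a causal window where each token only sees the last `window` positions (itself included).

```python
import math


def top_k_gate(router_logits: list[float], k: int = 2) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax their logits.

    Returns {expert_index: mixing_weight}; all other experts are skipped,
    which is where the compute saving of a sparse mixture comes from.
    """
    # Indices of the k largest logits (ties keep their original order).
    top = sorted(range(len(router_logits)),
                 key=lambda e: router_logits[e], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[e] for e in top)
    exps = {e: math.exp(router_logits[e] - peak) for e in top}
    total = sum(exps.values())
    return {e: v / total for e, v in exps.items()}


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Causal (no looking ahead) and limited to the most recent `window` keys.
    """
    return [[0 <= i - j < window for j in range(seq_len)]
            for i in range(seq_len)]


if __name__ == "__main__":
    # Hypothetical router scores for 8 experts on one token.
    print(top_k_gate([0.1, 2.0, -1.0, 2.0, 0.0, 0.3, -0.5, 1.0], k=2))
    for row in sliding_window_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

The point of the gate is that only k of the experts run per token, so parameter count and per-token compute come apart; the window keeps attention cost from growing with the full context.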
SaaS companies doing "AI as a service" have a big wall in front of them called "60%+ of the TAM can't upload their data to random-ass cloud providers much less one run by a guy recently fired by his own board of directors", and for big chunks of finance (SOX, PCI, bunch of stuff), medical (HIPAA, others), defense (clearance, others), insurance, you get the idea: on-premise is the play for "AI stuff".
A scrappy group of hackers too numerous to enumerate but exemplified by `ggerganov` and collaborators, `TheBloke` and his backers, George Hotz and other TinyGrad contributors, and best exemplified in the "enough money to fuck with foundation models" sense by Mistral at the moment are pulling a Torvalds and making all of this free-as-in-I-can-download-and-run-it, and this gets very little airtime all things considered because roughly no one sees a low-effort path to monetizing it in the capital-E enterprise: that involves serious work and very low shady factors, which seems an awful lot like hard work to your bog-standard SaaS hustler and offers almost no mega data-mining opportunity to the somnambulant FAANG crowd. So it's kind of a fringe thing in spite of being clearly the future.
It currently ranks 4th on the Chatbot Arena leaderboard (slightly behind GPT-4's Elo rating): https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...
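To put "slightly behind" on an Elo scale in perspective, here's a small sketch using the textbook Elo expected-score formula. The ratings below are invented for illustration, not the leaderboard's actual numbers, and the leaderboard's own fitting procedure may differ in detail.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model.

    Roughly the probability that A is preferred in a head-to-head vote.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Hypothetical ratings: a 10-point gap is close to a coin flip.
    leader, challenger = 1160.0, 1150.0
    print(f"{elo_expected_score(leader, challenger):.3f}")
```

A gap of a few points means the two models win head-to-head votes at nearly the same rate, so a rank difference of one or two places doesn't say much by itself.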
For software development I find that Phind is pretty good at combining search results with GPT-4 in a way that increases the quality of the result.
Maybe OpenAI can convince the Bing team to index everything using their embeddings. If ChatGPT could also read the text directly from Bing instead of having to "surf the web", it would be able to consume several search results at the same time. In the future I could even see Bing et al. running an LLM over all text when indexing a page to extract info and stats like a summary, keywords, truthfulness, usefulness, etc.
This is where I suspect Bard is going to be an absolute beast of a product. The ability to quickly and thoroughly consume a bunch of hits, find the best, and summarize them is something Google (and increasingly, Kagi) is uniquely positioned to do.
It would be pretty rad if she could just have the app on her tablet with a family plan. She doesn't use it quite enough to justify getting her own subscription, but especially if we could share GPTs across devices, so she gets the ones I make for her, but doesn't get flooded with my work or research related GPTs.
BTW, I once read that someone automated the generation of bedtime stories (with his children as the main characters) using the OpenAI API and speakers. I was quite amazed (not a thing I would do, but a nice use for GPT).
(Although I haven't yet myself tried any alternative that is clearly on par with ChatGPT 4)
I don’t even like that when my family picks up the remote, Apple TV assumes it’s me using the TV. They watch something and mess up my Up Next history and recommendations. I wish it supported using a PIN. I’ve thought about getting rid of the remote to force everyone to use their phone as a remote, because then it detects who is using it and automatically switches accounts. But that means everyone has to have an iPhone and have their phone charged, etc. Getting rid of the remote just for my convenience seems too inconsiderate.
they want you hooked on apps, API, etc, before the real costs are brought in. they likely should be charging anywhere from 50-100$ depending on hours
If you want an entirely free and open LLM experience, you can also run one of the ever-improving open source models on your own hardware. But for many if not most companies, paying $25/mo per user for something as amazing as ChatGPT-4 is a bargain.
Yea, another way to word it would be to imagine that they _only_ had a more expensive "no train" option. Now ask if it would be okay to offer a lower priced but yes-train version.
It would make more sense for them to just train on it anyway.
Alternatively, you could also use your own UI/API token (API calls aren't trained on). Chatbot UI just got a major update released and has nice things like folders, and chat search: https://github.com/mckaywrigley/chatbot-ui
But "we won't train from your data" is a powerful marketing line, and differentiator between classes of customer, even if they have no intention to train from the data of anyone.
On All In, they discussed the leverage from AI tools and they probably also meant open source, but one of the companies just rolled their own instance of a big monthly SaaS product because it was such a big expense for the startup.
- This is an essential, best-in-class tool. You wouldn't deny your employees a laptop or a free lunch, would you?
- $5/user/mo is a bargain compared to the hassle of building/hosting this yourself, punching holes in your firewall every time you need to receive a webhook, dealing with security and auth issues.
- $60k is half the cost of someone you don't need to hire on your in-house IT team. Does it make sense yet?
I'll take that bet ;) Not really sure about OpenAI, but you can absolutely negotiate with almost any company.
The thing is those same people need to be paid, and that’s a much (100x) larger bill, so the extra amount doesn’t really signify.
The higher bandwidth is to clearly entice new customers, but the question remains, what happens to the old ChatGPT Plus users? Do their quotas get eaten up by these new teams?
Chat History is off for this browser. When history is turned off, new chats on this browser won't appear in your history on any of your devices, be used to train our models, or stored for longer than 30 days. This setting does not sync across browsers or devices.
Aside: If you can see other colleagues' interactions with the custom/private GPTs, it could be quite an efficient way to share knowledge, especially for people in disparate time zones.
This is probably run on Microsoft servers (Azure, basically), not OpenAI servers, so it shouldn't directly compete for capacity. This is more of a "the pie got bigger" situation.
Hoping to see something good come out of this
Use cases I see are common ones - basic usage of ChatGPT but admin can control access. Provides ability for companies to bill directly instead of reimbursements, and have more control over it. HR docs and policies can be a separate GPT. Though nothing which requires multi level access control.
UI components can be generated as per your UI guidelines, same for tests. Hoping for good things
You have one non actionable marketing answer, a growth graph created without axis (what are people going to do with that?) and a Python file which would be easier just to run to get the error.
That kind of reinforces my belief that those AI tools aren't without their learning curves despite being in plain English.
Does this mean they will still use your data for other non-training purposes?
Does anyone know if this applies to voice conversations? This is me while I'm driving: upload big PDF -> talk to GPT: "Ok, read to me the study/book/article word for word."
Good job OpenAI.
Sorry where do you see that? I only see "higher usage limits"?
That article doesn't say 100.
100 is what I read in the openai forums earlier today.
It seems odd we have enterprise but cannot access GPT-4 through the main ChatGPT interface.
The former should have GPT-4 access; if not, that’s a bug, and I can look into it if you email me at atty@openai.com.
The API and ChatGPT are separate products, and usage or credits purchased for the API do not provide paid ChatGPT access.
My wife uses chatgpt only a few times a day.
I guess I need to 2x my browsers. I don't think this would work on the phone because I believe I need my browser open for chatgpt to continue its computations.
The GPT store
I was working on something at the end November that was proposing competent PRs based upon request for work in a GH issue. I was about halfway through the first iteration of a prompt role that can review, approve and merge these PRs. End goal being a fully autonomous software factory wherein the humans simply communicate via GH issue threads. Will probably be back on this project by mid February or so. Really looking forward to it.
Bigger, more useful context is all I think I really want at this point. The other primitives can be built pretty quickly on top of next token prediction once you know the recipes.
$25 per user/month billed annually
$30 per user/month billed monthly
https://openai.com/chatgpt/pricing It is very clear on the highlight.
* Higher message cap.
* Create and Share GPTs within workspace.
* Admin console.
* No training.
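Quick arithmetic on those two rates, as a sketch: the only inputs are the $25 and $30 per-user prices quoted above, and the seat count is a made-up example.

```python
ANNUAL_RATE = 25   # USD per user per month, billed annually
MONTHLY_RATE = 30  # USD per user per month, billed monthly


def yearly_cost(seats: int, rate_per_user_month: int) -> int:
    """Total cost in USD for one year at a given per-user monthly rate."""
    return seats * rate_per_user_month * 12


if __name__ == "__main__":
    seats = 4  # hypothetical small team
    annual = yearly_cost(seats, ANNUAL_RATE)
    monthly = yearly_cost(seats, MONTHLY_RATE)
    print(f"annual billing:  ${annual}")
    print(f"monthly billing: ${monthly}")
    print(f"difference:      ${monthly - annual}")
```

So the annual commitment saves $60 per seat per year, at the cost of being locked in for twelve months.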
You have set up your system to run different AI models and compare their performance using a text editor. You are using Mixtral-8x7, a high-quality open-source model developed by Mistral AI, Dolphin, an emulator for Nintendo video games, 3.5-Turbo, a customized version of GPT-3.5, a powerful natural language model, and 4-Series Preview, a new version of the BMW sports coupe. You have noticed that the 4.5-Preview, an upcoming update of GPT-3.5, is slightly better than Mixtral-8x7, which used to be a close match. You are still waiting to access Mistral-Medium, a prototype model that is even better than Mixtral-8x7, but only available to a limited number of users.
You have discovered that Perplexity, an AI company that provides information discovery and sharing services, offers free access to Mistral-Medium through their partnership with Mistral AI. You think that Perplexity is making a mistake by giving away such a valuable model, and that they are underestimating the superiority of Mistral-Medium over the 4.5-Preview. You also think that Mistral AI is the new leader in the AI industry, and that their techniques, such as DPO (Data Processing Optimization), Alibi (a library for algorithmic accountability), sliding window (a method for analyzing time series data), and modern mixtures (a way of combining different models), are well-known and effective. You believe that the advantage of Mistral AI lies in the gap between their innovation and the ability of other developers to replicate it on cheaper and more accessible hardware. You also think that the enterprise market is not fond of the complex structure of GPT-3.5 and its variants, and that they prefer to use Mistral AI's models, which are more affordable and operable on their own premises.
You end your text with a quote from the movie Armageddon, which implies that you are leaving a situation that you dislike, but also admire.
[0] https://albumartexchange.com/coverart/gallery/ra/radiohead_t...
Plus, there's also the aesthetic, for a while we did it out of angst and to show we weren't all that wrapped up in the modern sensibilities. It seems like the situation as society moved forward may have only gotten worse for this.
If you're confused about something gen z does, it's either:
1) they're teenagers and grew up with phones, get with the times old man
2) a large corporation that didn't have a monopoly on communication (maybe basically everything else too) when you grew up now does, and it's an artifact of being forced to grow up in that world
3) global warming is scary as shit, no one in power seems to give a shit, economies are crumbling, we're temporarily losing the worldwide battle for continued democracy, why wont they stop the war, school shootings, nuclear destruction isn't too close but doesn't look like it's getting farther away, etc etc etc. Basically, "have you seen the world? Why are you even trying, let alone the teenagers today?" type thing.
Again, my observation here is not that "TVs should stay grayscale" but that this lowercase movement is now being used to "fight the AI generated content" which I find pretty funny because it makes no sense. You can tell ChatGPT to write in lowercase.
It especially got traction after @sama's tweet that had no intention to associate it with non-GPT content, but just weirdly flexed that he types in lowercase.
What I am confused about, though, is that the parent seems to be mentioning models beyond the GPT4 instance I currently have access to. I checked their Twitter and have seen no announcement for any 4.5 or 4 series previews. Is this just available to people using the API, or did I miss something?
Well, the Paul Ricard circuit in France has a straight called Mistral. Plenty of BMWs have been there for sure, and a zillion other cars.
I wonder if that could have confused the AI a little in combination with other hints. Turbo?
If that's a thing maybe we should start picking our names not only to make them googlable but also not to confuse LLMs at least for the next few years. Months?
I don't recall if I saw any press calling anything `4.5`, but it's a different model in some important ways (one suspects better/cheaper quantization at a minimum) and since they've used `.5` for point releases in the past it seemed the most consistent with their historical versioning.
The fact it’s incapable of simple requests that an alternative can is absolutely part of a worthwhile comparison.
That is not a measure of how sophisticated and capable a model is.
GPT4 is a more sophisticated, more capable model than mistral.
If that doesn’t make it the “better” for you, that’s fine; but any attempt to argue about the capabilities of the models is misguided.
Restrictions placed on a model are an orthogonal concern to its capabilities.
…but sure, you can invent some benchmarks to score models on other criteria, which is entirely valid.
It’s perfectly fair to say that GPT4 doesn’t top all possible metrics… only the meaningful ones about model capabilities.
Both tools are generative systems that produce text in response to a prompt. If Mistral were mute on random topics for no other reason than that its makers dislike talking about them, would you say it doesn't count?
It's heavily integrated with my custom model server and stuff and I'm slowly getting it integrated with other leading tools (vscode and nvim and stuff).
I plan to MIT it all once it's at a reasonable RC. If I get there it will be available at `https://hyper-modern.ai` and `https://github.com/hyper-modern-ai`.
Thanks for asking!
update: it actually timed out with no result/explanation.
Coincidentally I just used this form yesterday and got the confirmation about opting out.
So if people are going through the steps now they might indeed think that they no longer have access to advanced features.
Just like the EU knows better about what chargers people should use than customers and engineers? Such wise bureaucrats!
They hide this link a bit. They completed my opt-out request in about ten minutes and at least claim to be not using any of my data going forward for training.
I didn't lose any features like Chat History
(Deliberately using anachronistic examples here. The issue has been with us for a while.)
Normal humans care about all of those with their families.
To give another example: The cashier at the supermarket knows when I'm buying condoms, but that doesn't mean I want to tell my parents.
And neither would I want to know as a parent, when or whether my kids order bondage gear on Amazon.
It's not just about my information going to other people, but also keeping certain information of other people from reaching me.
You can opt out of ChatGPT using your conversations for future training, without disabling any features like convo history.
Either some browser plug-in like an adblock is hiding the button from you, or you're not noticing and clicking it (I'm guessing the former [1]).
For me, on iPhone, there's a black button with white text "Make a Privacy Request" which sort of hovers bottom centre of the page the way "Chat Live with Support!" buttons often hover.
Click on that button to get to this - https://privacy.openai.com/policies?modal=take-control - which allows you to either delete your account, or:
"I would like to: Step 1 of 2 Do not train on my content Ask us to stop training on your content"
They then tell you it applies to content going forward, not stuff already done. But that's the opt out that doesn't require losing ChatGPT conversation history.
[1] On iOS Safari with 1Blocker enabled, I could see the button without it being hidden as an annoyance or widget or whatever. However, when I tried entering an email address for opting out, to check it still works, it gave me an error message that suggested adblock-type things might be the issue. I opened the page in Firefox for iOS (so same browser engine as Safari, but without 1Blocker) and it worked with no error message.
The benefit of training on the data of people they've explicitly agreed not to train on (which is probably a very small % of even paying users, let alone free ones) is unlikely to be worth the risks. They'd be more likely just to not offer the opt-out agreement option.
But ultimately, we often can't know if people or companies will or won't honour agreements, just like when we share a scan of a passport with a bank we can't be sure they aren't passing that scan on to an identity-theft crime ring. Of course, reputation matters, and while banks often have a shit reputation, they generally don't have a reputation for doing that sort of thing.
OpenAI have been building up a bit of a shit reputation among many people, and have even got a reputation for training on data that other people don't believe they should have the right to train on, so that won't help get people to trust them (as demonstrated by you asking the question), but personally I still think they're not likely to cross the line of training on data they've explicitly agreed with a customer not to use.
Roark66 gave a better answer, but...
On desktop I clicked the link and immediately saw "Make A Privacy Request" top right (where login / account / menu buttons might be)
I must ask did you honestly just miss this UI element or do you think it might be some confirmation bias that you already had?
Quite the opposite actually. My intent is to shed light on the fact that sharing information with OpenAI is not private. And you should not do that with information that you wouldn't even share with people you trust.
I'm not OP, but I think you're missing the point.
Privacy and trust isn't really a 1D gradient, it's probably planar or even spatial if anything.
Personally I'd be more willing to trust OpenAI with certain conversations because the blowback if it leaves their control is different than if I have that same conversation with my best friend and it leaves my best friend's control. The same premise underlies how patients can choose who to disclose their own health matters to, or choose who their providers can disclose to.
Same reason behind why someone may be willing to post a relationship situation to r/relationship_advice and yet not talk about the same thing with family and friends.
If you don't want your family to know something, you shouldn't tell it to OpenAI either.
Yeah, I think this is an over reduction of personal privacy models, but can you tell me why you believe this?
I ask that you consider the people who use Reddit and the people who run Reddit independently. The people who use Reddit are not in a position of power over someone who asks for advice. The people who run Reddit on the other hand, are in a position of power to be able to emotionally manipulate the person who asked for advice. They can show you emotionally manipulative posts to keep your attention for longer. They can promote your post among people who are likely to respond in ways that keep you coming back.
OpenAI has a similar position of power. That's why you shouldn't trust people at either of those companies with your private thoughts.
Your family has limited power in the grand scheme of things, but the likelihood that they may leverage what power you give them over you is much higher.
The IRS has vast power and is likely to use it against you, hence why tax fraud is usually a bad idea.
Hence "planar" rather than linear.
I think your use of the word "individual" is a bit weird here. I absolutely find it likely that OpenAI is doing individualized manipulation against everyone who uses their systems. Maybe this would be more obvious if you replace OpenAI with something like Facebook or Youtube in your head.
Just because they are using their power on many individuals doesn't mean that they are not using their power against you too.
Your family is in a position of power, which is why it can be scary to share information with them. People at OpenAI are also in a position of power, but people who use their services seem to forget that, since they're talking to them through a computer that automatically responds.
tldr: power (or if you want, impact) is the linear dimension, likelihood adds a second dimension to the plane of trust.
> Just because they are using their power on many individuals doesn't mean that they are not using their power against you too.
Yeah but at this point you're identifying individual risks and grasping at straws to justify manipulating* everyone's threat model. You can use that as your own justification, but everyone manages their own personal tolerance for different categories of risks differently.
*Also, considering the published definition of manipulation is "to control or play upon by artful, unfair, or insidious means especially to one's own advantage," I think saying that "OpenAI is doing individualized manipulation against everyone who uses their systems" is an overreach that requires strong evidence. It's one thing if companies use dark UX patterns to encourage product spend, but I don't believe (from what I know) that OpenAI is at a point where they can intake the necessary data both from past prompt history and from other sites to do the personalized, individualized manipulation across future prompts and responses that you're suggesting they're likely doing.
Considering your latest comment, I'm not sure this discussion is receiving the good faith it deserves anymore. We can part ways, it's fine.