The idea that a company is an AI company should be as ridiculous as a company being a Python company. "We are Python-first, have Python experts, and all of our products are made with Python. Our customers want their apps to have Python in them. We just have to 'productize Python' and find the right killer app for Python and we'll be successful!" Going at it from the wrong direction. Replace Python in that quote with AI, and you probably have something a real company has said in 2024.
However, a lot of those got a bunch of investment or made some decent money in the short term. Very few are still around. We will see the same pattern here.
And post-ChatGPT, very few people want to have to deal with "a (more or less) general purpose chatbot."
It's awful and a complete waste of time. I'm not sure if LLMs are getting good use yet / general chatbots are good or ready for business use.
So... certainly there's a space for new products.
...but perhaps for existing products, it's not as simple as 'slap some random AI on it and hope you ride the wave of AI'.
I mean, I still don't. But from a cynical business point of view, cutting customer service costs (something virtually every company of scale has) on 99% of customer calls is a very obvious application of a general-purpose chatbot.
expand that to "better search engine" and "better autocomplete" and you already have very efficient, practical, and valuable tools to sell. but of course companies took the angle of "this can replace all labor" instead of offering these as assistive productivity tools.
If you can't convince people that this is benefiting them, and instead focus on talking to investors about how much you can kill off the working class (aka, your "customers" and nowadays "product audience"), you will make it harder to properly sell your product to that audience. Companies have forgotten who the real customers are; no wonder their products aren't resonating.
When you’re truly bringing novel value to things, sometimes you need to say “we can do this cool thing, but don’t know what that means”. Simply knowing that capability opens you up to better sets of solutions.
Customers are also more interested in AI products. The tech industry has stagnated for years with incremental improvements on existing products. ChatGPT and generative AI are new capabilities that draw interest, and companies have been doing anything they can to stand out today.
Every cycle, there are all types of people who hop on board whatever the hype train is... it's the same mindset as pioneering for gold in the wild west.
I just hope we can move along more in the "wheat" direction with AI products. There's so much low-effort crap already out there.
So just zooming out, we need people trying to figure out what can be built with this Lego set. We also need people like you're saying to work the other side so everyone can meet in the middle.
- Bezos saw the growth rate of the internet, spent a few months mulling over the question: "what business would make sense to start in the context of massive internet adoption" and came up with an online bookstore.
- OpenAI's ChatGPT effort really began when they saw Google's paper on transformers and decided to see how far they could push this technology (it's hard to imagine they forecasted all the chatbot use cases; in reality I'm sure they were just stoked to push the technology forward).
- Intel was founded on the discovery of the integrated circuit, and again I think the dominant motivation was to see how far they could push transistor density with a very hazy vision at best of how the CPUs would eventually be used.
I think the reason this strategy works is that the newness of a truly important technology counteracts much of the adverse selection of starting a new business. If you make a new To-Do iPhone app, it's unlikely that people have overlooked a great idea in that space over the last 10 years. But if lithium ion batteries only just barely started becoming energy dense enough to make a car, there's a much more plausible argument why you could be successful now.
Said another way: "why hasn't this been done before?" (both by resource-rich incumbents as well as new entrants) is a good filter (and often a limiting one) for starting a business. New technological capabilities are one good answer to this question. Therefore if you're trying to come up with an idea for a business, it seems reasonable to look at new technologies that you think are actually important and then reason backward to what new businesses they enable.
Two additional positive factors I can think of:
1. A common dynamic is that a new technology is progressing rapidly but is of course far behind traditional solutions at the outset. Thus it is difficult to find immediate applications, even if large applications are almost guaranteed in 10-20 years. Getting in early - during the borderline phase where most applications are very contrived - is often a big advantage. See Tesla Roadster (who wants a $100k electric sports car with 200mi range and minimal charging network?), early computers (what is the advantage of a slow machine with no GUI over doing work by hand?), and perhaps current LLMs (how valuable is a chatbot that frequently hallucinates and has trouble thinking critically in original ways)? It's the classic Innovator's Dilemma - we overweight the initial warts and don't properly forecast how quickly things are improving.
2. There is probably a helpful motivational force for many people if they get to feel that they are on the cutting edge of technology that interests them and building products that simply weren't possible two years ago.
This is the fundamental problem that prevents generative AI from becoming a "foundational building block" for most products. Even with rigorous safety measures in place, there are few guarantees about its output. AI is about as solid as sand when it comes to determinism, which is great if you're trying to sell sand, but not so great if you're trying to build a huge structure on top of it.
It's all well and good to say "Make something people want," but for anything that people want, usually one of three things is true:
1. Someone else is already making it.
2. Nobody knows how to make it.
3. Nobody knows that people want it.
People experimenting with 2 and 3 will have a lot of failures, but the great successes will come from those groups as well.
Sure, every trend in business has a lot of companies going "we should do this because everyone else is." It was a dumb idea for previous trends and it is a dumb idea now. Consider how many companies did that for the internet. There were a lot of poorly thought out forays into having an internet presence. Of those companies still around, pretty much all have an internet presence now that serves their purposes. They transitioned from "because everyone else is" as their motivation to "we want specific abilities x, y, and z."
Perhaps the best way to get from "everyone else is doing it" to knowing what to build is to play in the pool.
But AI has always been a secondary augmentation to the product itself. It’s a tool, it shouldn’t be the other way around.
We initially were developing a system that we had hoped could handle everything and eject any workflow issues to a human so the operations team could kick the machine. We were hoping to avoid an interface altogether on the customer side.
After a few versions and attempts at building this system, we moved towards a traditional app where we focused on building a product people wanted and automate parts of it over time. But even the parts we automated needed an interface for customers to spot check our work. So we found a great designer.
...Before we knew it, we were building a traditional company, with some AI. The company is doing well and people love what we're building, but it's different than we imagined.
We still believe in the long term vision and promise of the technology, but the article is right, this isn't going to be an overnight process unless some new architecture emerges.
In the meantime, we're focused on helping people get from A to B easily using whatever means necessary, because moving f**ing sucks. If you're moving soon or know anybody who is, we'd be happy to help them. -P
It's okay, I mean even the internet started out as Charlie_Bit_Me.avi and free porn.
I'm betting the adoption curve for AI hits when the first company sells photorealistic porn of anyone you have a picture of. Hell, half ass Photoshop porn is lucrative already.
Charlie Bit Me is of the YouTube generation, so it wasn't passed around as an avi email attachment like some older memes of the previous generation. From that long ago, Exploding Whale comes to mind.
The reason we can build such deep and complex software systems is because each layer can assume the one below it will "just work". If it only worked 99% of the time, we'd all still be interfacing with assembly, because we'd have to be aware of the mistakes that were made and deal with them, otherwise the errors would compound until software was useless.
Until AI achieves the level of determinism we have with other software, it'll have to stay at the surface.
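A rough way to see the compounding: assuming each layer fails independently and all layers share the same reliability (both simplifications, and `stack_reliability` is just an illustrative name), the chance the whole stack works is the per-layer figure raised to the number of layers.

```python
def stack_reliability(per_layer: float, layers: int) -> float:
    """Probability that every layer works, assuming independent failures."""
    if not 0.0 <= per_layer <= 1.0:
        raise ValueError("per_layer must be a probability between 0 and 1")
    if layers < 0:
        raise ValueError("layers must be non-negative")
    return per_layer ** layers


# How quickly "works 99% of the time" erodes as layers are stacked.
for layers in (1, 10, 50, 100):
    print(f"{layers:>3} layers at 99% each: {stack_reliability(0.99, layers):.1%}")
```

Under these assumptions, a 100-layer stack built from 99%-reliable layers would work only about 37% of the time.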
We probably need a lot more work along this dimension of finding use cases where strong automatic verification of AI outputs is possible.
Going further, our predecessors put so much work into getting non-deterministic electronics together providing us with a stable and _correct_ platform, it looks ridiculous how people were trying to squeeze another layer of non-determinism in between to solve the same classes of problems.
>If your AI travel agent books vacations to the correct destination only 90% of the time
that would be using the wrong tool for the job. an AI travel agent would be very useful for making suggestions, either for destinations or giving a list of suggested flights, hotels etc, and then hand off to your standard systems to complete the transaction.
there are also a lot of systems that tolerate "faults" just fine such as image/video/audio gen
But that’s a recommendation engine and we have that already all over the place.
Well, I don't agree. I think there are ways to make this successful, but you have to be honest about the limitations you're working with and play to your strengths.
How about an AI travel agent that gets your itineraries at a discount with the caveat that you be ready for anything. Like old, cheap standby tickets where you just went wherever there was an empty seat that day.
Or how about an AI Spotify for way less money than current Spotify. It's not competing on quality, it can't. Occasionally you'll hear weird artifacts, but hey it's way cheaper.
That could work, imo
AI is creating a post-scarcity content economy where quality is going to be the only driver of value.
If you are the rights holder of any premium human created media content you are not going to let a 'cheap' AI tool get access to recommend it out to people.
I'm not disagreeing with the "needs to work deterministically" -- there is a need for that, but this is a poor example. "Hey robot, plan a trip to Mexico" might still save me time overall if done right, and that has value.
Call centre workers are often dreadfully inaccurate as well. Same with support engineers.
Heck even for banking, there are enormous teams fixing every screw up made by some other employee.
If you're writing a random number generator, that generates numbers between 0 and 100. How would you test it? Throw your hands up in the air and say nope, can't test it, it's not deterministic! Or maybe you can just run it 1000 times and make sure all the numbers are indeed between 0 and 100. Maybe count up the number frequencies and verify its uniform. There's lots of things you can check for.
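As a minimal sketch of that idea in Python: `check_rng` and its 50% tolerance are made up for illustration, and the bucket comparison is a crude sanity check, not a proper statistical test such as chi-squared.

```python
import random
from collections import Counter


def check_rng(generate, runs=1000, low=0, high=100, buckets=10, tolerance=0.5):
    """Return (all_in_range, roughly_uniform) for a generator of integers."""
    samples = [generate() for _ in range(runs)]
    all_in_range = all(low <= x <= high for x in samples)
    if not all_in_range:
        return False, False

    # Group values into near-equal-width buckets and count each bucket.
    span = high - low + 1
    counts = Counter((x - low) * buckets // span for x in samples)

    # Every bucket should land within `tolerance` of its expected share.
    expected = runs / buckets
    roughly_uniform = all(
        abs(counts[b] - expected) <= tolerance * expected for b in range(buckets)
    )
    return all_in_range, roughly_uniform


rng = random.Random(42)  # seeded so the check is repeatable
print(check_rng(lambda: rng.randint(0, 100)))  # a healthy generator
print(check_rng(lambda: 7))                    # in range, but clearly not uniform
```

The point is that the test asserts properties of the distribution rather than exact outputs, which is the same shift you need when testing an LLM.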
So do the same with your LLMs. Test it on your specific use-cases. Do some basic smoke tests. Are you asking it yes or no questions? Is it responding with yes or no? Try some of your prompts on it, get a feel for what it outputs, write some regexes to verify the outputs stay sane when there's a model upgrade.
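A smoke test along those lines might look like the sketch below. `fake_model` is a hypothetical stand-in for whatever model call you actually make, and the prompts and regex are only examples of the kind of check meant here.

```python
import re

# Accepts a bare "yes" or "no", ignoring case, surrounding whitespace,
# and trailing punctuation such as "." or "!".
YES_NO = re.compile(r"^\s*(yes|no)\b[\s.!]*$", re.IGNORECASE)


def is_yes_or_no(reply: str) -> bool:
    return YES_NO.match(reply) is not None


def failing_prompts(ask, prompts):
    """Return the prompts whose replies are not a plain yes/no."""
    return [p for p in prompts if not is_yes_or_no(ask(p))]


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to the model you use.
    canned = {
        "Is 7 a prime number?": "Yes.",
        "Is the Moon a planet?": "no",
    }
    return canned.get(prompt, "Well, it depends on how you look at it.")


prompts = ["Is 7 a prime number?", "Is the Moon a planet?", "Is a hot dog a sandwich?"]
print(failing_prompts(fake_model, prompts))
```

Rerunning the same prompt list after a model upgrade gives you a quick signal when the output format drifts, even though it says nothing about whether the answers are right.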
For "quality" I don't think there's a substitute for humans. Just try it. If the outputs feel good, add your unit tests. If you want to get scientific, do blind tests with different models and have humans rate them.
Abyss: 1 Ambiguous: 3 Cacophony: 3 Crescendo: 3 Ephemeral: 3 Ethereal: 3 Euphoria: 3 Labyrinth: 3 Maverick: 3 Melancholy: 3 Mellifluous: 3 Nostalgia: 3 Oblivion: 3 Paradox: 3 Quixotic: 1 Serendipity: 3 Sublime: 3 Zenith: 3
I find it useful for:
* throwing ideas at a wall and rubber-ducking my emotional state and feelings.
* creating silly, meme images in strange circumstances, sometimes.
* answering simple "what's the name of that movie / song / whatever" questions
Is it always right? Absolutely not. Is it a good starting point? Yes. Think of it like school in the early days of Wikipedia. "Can I use Wikipedia as a source? No. But you can use it to find primary sources and use those in your research paper!"
When I look for answers to specific questions, I either search Wikipedia, or ask ChatGPT. "Searching the Internet" doesn't work anymore with all the ads, pop-ups and "optimized" content that I have to consume before I get to find the answers.
It's like talking to an intelligent person about a topic you want to learn, who knows it well enough to teach you if you keep asking questions.
a) Write me a shell script which does this and that. b) What Linux command with what arguments do I call to do such and such thing. c) Write me a function / class in language A/B/C that does this and that. d) Write me a SQL query that does this and that. e) Use it as a reference book for any programming language or whatever other subject.
etc. etc.
The answers sometimes come out wrong and/or contain non-trivial bugs. Being an experienced programmer, I usually have no problem spotting those just by looking at the generated code or running a test case created by ChatGPT. Alternatively, there are no bugs but the approach is inefficient. In that case, if I explain why it is inefficient and what to use instead, ChatGPT will fix the approach. Basically it saves me a shit ton of time.
Also: https://www.reuters.com/world/europe/eiffel-tower-grows-six-...
Utility may have been an afterthought, but it's still there.
We should have known that once we pass the Turing Test it would almost instantly become as passe as Deep Blue beating Kasparov on the road to general intelligence.
I am taking a break from my LLM subscriptions right now for the first time to gain some perspective, and all I miss them for is as a code assistant. I would also miss them for learning another human language. It seems unsurprising that large language models' use cases are in automated language. What is really surprising is how very limited the use cases for automated language seem to be.
Far more useful than simulating a person, OpenAI managed to index so much information and train their models to present it in a compact way, making ChatGPT better than Google search for some purposes. Also, code generation.
It was literally replacing a hierarchical link tree and that almost always was easier to use.
It can be hard enough for humans to just look at some (already consistently passing) tests and think, "is X actually the expected behavior or should it have been Y instead?"
I think you should have a look at the abstract, especially this quote:
> 75% of TestGen-LLM's test cases built correctly, 57% passed reliably, and 25% increased coverage. During Meta's Instagram and Facebook test-a-thons, it improved 11.5% of all classes to which it was applied, with 73% of its recommendations being accepted for production deployment by Meta software engineers
This tool sounds awesome in that it generated real tests that engineers liked! "zero human checking of AI outputs" is very different though, and "this test passes" is very different from "this is a good test"
I use it as a secondary when the other two are chewing on other tasks already.
I only own it as I am an outrageously heavy consumer of LLMs for all sorts of little projects at once and they all seem to pause one window if you use another.
Like even when you are writing code. Describe the solution and ask AI to write it, don't specify the requirements to AI and hope it will write it.
If your 5.1 GHz (billion instructions per second) CPU had a 0.00000001% chance of failing at a given instruction, you'd have a 40% chance of a crash every second.
If a flight had a 1% chance of killing everyone aboard, then 10 million passengers/day * 1% = 100,000 people would die in plane crashes every day.
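Both figures hold up if you run the numbers. The sketch below takes the comment's inputs as given (one instruction per clock cycle at 5.1 GHz, independent failures, 10 million passengers a day); the function name is only for illustration.

```python
import math


def p_at_least_one_failure(p_per_op: float, ops: float) -> float:
    """1 - (1 - p)^n, computed via log1p/expm1 so a tiny p stays accurate."""
    return -math.expm1(ops * math.log1p(-p_per_op))


# 0.00000001% per instruction is 1e-10 as a probability.
cpu_crash_per_second = p_at_least_one_failure(1e-10, 5.1e9)
print(f"Chance of at least one failure per second: {cpu_crash_per_second:.1%}")

# Expected deaths if every flight had a 1% chance of killing everyone aboard.
daily_passengers = 10_000_000
expected_deaths = daily_passengers * 0.01
print(f"Expected deaths per day: {expected_deaths:,.0f}")
```

The CPU figure comes out at roughly 40% per second, matching the estimate above, because the expected number of failures per second is 5.1e9 * 1e-10 = 0.51.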
Newly-"AI"-branded things that I have touched work substantially less than 90% of the time. There are like 3 orders of magnitude difference, even people who aren't paying any attention at all are noticing it.
Software pretty much always "works" when you consider the definition of work to be "does what the programmer told it to". AI? Not so much.
That's exactly my point. You have to interact directly with the A.I. and be aware of what it's doing.
They google crap all the time. They're as unaware of things as we are.
The tests they have access to are much better than anything we can get our hands on though.
An expert refreshing their knowledge on Google is not the same as a layman learning it for the first time. At all.
This has happened to me 3-4 times, hasn't sent me wrong yet. Meanwhile I've had doctors misdiagnose me or my wife a bunch of times in my life. Doctors may have more knowledge but they barely listen, and often don't keep up with the latest stuff.
Maybe I'm a low-risk guy but I would never follow a medical solution spit out by an LLM. First, I might be explaining myself badly, hiding important factors that a human person would spot immediately. Then, yeah, the hallucinations issue, and if I have to double check everything anyway, well, just trust a (good) professional.
Pull requests in GitHub are actually very similar conceptually to a consensus mechanism used in cryptocurrencies. Everyone has an identical copy of the main branch with an identical history of every commit in order, a PR is saying "I think this commit goes next" and, if you use code reviews, the PR approval is consensus.
Have you ever seen a git graph? Does this look linear? https://tortoisegit.org/docs/tortoisegit/images/RevisionGrap...
git is very much a blockchain
- sequential list of changes to a data source (commits)
- single, shared history of changes (main branch)
- users creating potential next change(s) to be added to the history (side branches and forks)
- consensus mechanism for new change blocks (merge requests and code reviews or approvals)
What's missing?
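The part of the comparison that is easy to show in code is the hash linking: a git commit's ID is a hash over its contents, which include the IDs of its parent commits. The sketch below is a toy model of that, not git's real object format (git hashes a full commit object with tree, author, and so on, using SHA-1 by default); `commit_id` is a made-up helper. It says nothing about whether code review counts as consensus.

```python
import hashlib


def commit_id(parents, message):
    """Toy commit ID: a hash over the parent IDs plus the commit message."""
    payload = "\n".join([*(f"parent {p}" for p in parents), "", message])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# A short history: an initial commit, two branches, and a merge with two parents.
initial = commit_id([], "initial commit")
on_main = commit_id([initial], "add feature")
on_side = commit_id([initial], "experiment on a side branch")
merged = commit_id([on_main, on_side], "merge side branch")

# Rewriting the first commit changes the ID of everything built on top of it.
rewritten_initial = commit_id([], "initial commit (edited)")
rewritten_main = commit_id([rewritten_initial], "add feature")
print(on_main != rewritten_main)
```

Note that the merge commit has two parents, so the history is a graph of hash-linked commits, not a single linear chain.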
Your misery and wasted time is their improved stock price and bonus package.
There are some more advanced ones using ChatGPT now. I'm guessing they simply pre-prompt it. Can lead to funny results like a customer making the Chevy bot implement an algo in Python.
I think the problem is you haven’t shifted your mindset to using AI correctly yet.
Edit: More everyday examples from just the last 3 days
- Use carbide bits to drill into rocks. Googling “best bits for drilling rocks” doesn’t bring up anything obvious about carbide but it was the main thing chatGPT suggested.
- gave it dimensions for a barn I’m building and asked it how many gallons of paint I would need of a particular type. I could probably work that out myself but it’s a bunch of lookups (what’s the total sq footage, how many sq ft per gallon, what type of paint stands up to a lot of scuffing etc.)
- coarse threaded inserts for softwood when I asked it for threaded insert recommendations. I probably wouldn't have cared otherwise, and fine threaded inserts slip right out of pine.
- lookup ingredients in a face cream and list out any harms (with citations) for any of them.
- speeds and feeds for acrylic cutting for my particular CNC. Don’t use a downcut bit because it might cause a fire, something I didn’t consider.
- an explanation of relevant NEMA outlets. Something that’s very hard to figure out if you’re just dropped into it via googling.
clearly anyone trying to buy a car, which is already an ordeal with a human as is.
>I literally use ChatGPT 30 times a day
good for you? I use Google. most of my queries aren't complex.
>Isn’t “this not good enough yet” line getting old?
as long as companies pretend 2024 AI can replace skilled labor, no. It's getting old how many more snake oil salesmen keep pretending that I can just use ChatGPT to refactor this very hot loop of performance sensitive code. And no ChatGPT, I do not have the time budget (real time) to hook into some distributed load for that function.
I'm sure in a decade it will wow me. But I prefer for it to stay in its lane and for me to stay in mine for that decade.
>There nothing else that can estimate the number of cinder blocks I need to use for a project
is Calculus really this insurmountable feat to be defending big tech over? I'm not a great mathematician, but give them excel/sheets and they can do the same in minutes.
>I can think of literally thousands of things I have asked that would have taken hours of googling that I can get an answer for in minutes.
I'm glad it works out for you. I'm more scrutinous in my searches and I see that about half the time its sources are a bit off at best, and dangerously wrong at worst. 50/50 isn't worth any potential time saved for what I research.
>I think the problem is you haven’t shifted your mindset to using AI correctly yet.
perhaps. But for my line of work that's probably for the best.
There is an indictment of AI "products" if I ever heard one
It’s like people that kept going to the library even with Google around. You’re not playing to the strengths of AI and are relying on whatever suboptimal previous method you used to find the answers. It does really, really well with very specific queries with a lot of lookups and dependencies that nothing else can really answer without a lot of work on your end.
And do you? Every time someone tried to show me examples of “how amazing ChatGPT is at reasoning”, the answers had glaring mistakes. It would be funny if it weren’t so sad how it shows people turning off their critical thinking when using LLMs, to the point they won’t even verify answers when trying to make a point.
Here’s a small recent example of failure: I asked the “state of the art” ChatGPT model which Monty Python members have been knighted (it wasn’t a trick question, I really wanted to know). It answered Michael Palin and Terry Gilliam, and that they had been knighted for X, Y, and Z (I don’t recall the exact reasons). Then I verified the answer on the BBC, Wikipedia, and a few others, and determined only Michael Palin has been knighted, and those weren’t even the reasons.
Just for kicks, I then said I didn’t think Michael Palin had been knighted. It promptly apologised, told me I was right, and that only Terry Gilliam had been knighted. Worse than useless.
I also usually follow most prompts with “look it up I want accurate information”
I had about 5-10 cinder blocks left over, not bad for an order of ~150
All secondary branches are works in progress that may be proposed as new commits to main.
Sticking with the blockchain comparison, every side branch in git is akin to potential blocks that miners are working on.
My point is that git is a data store involving a genesis block (initial commit), blocks of changes/diff's, tracked in sequential order, and with a form of consensus (code reviews and merges to primary).
What is missing that makes it not a blockchain?
And my caveat here, I can't stand arguments for cryptocurrencies and have never purchased any. Blockchain as a concept is fine, and git is a blockchain as best I can tell.
The point is ChatGPT's wild success doesn't automatically mean consumers want, or will ever want, a chatbot as their primary interface for your specific app or service.
They left room for the idea that the technology could evolve to be useful. You're simply dismissing anyone who cannot use the technology as is as "using it wrong".
As someone who did a tad of UX, that's pretty much the worst thing you can say to a tester. It doesn't help them understand your POV, it builds animosity between you and the tester, and you're ruining the idea of the test because you are not going to be there to say "you're doing it wrong" when the UX releases. There's 0 upsides to making such a response.
You can look at it in two ways, neither are particularly wrong unless your job is in fact to navigate file systems.
That's true, and they lack that capability. Many people seem to react as though this means they're missing all value, however. I find them incredibly useful; it just isn't possible to get much value out without investing effort myself.
That is partially marketing's fault, so I say that confusion is self inflicted. Because marketing isn't focusing on "make yourself more productive!"