To me it's not clear that it should be interpreted as an improvement: what I read in this summary is that users had to search more and to enter longer queries to get to what they needed.
Users struggle to find the right stuff, or stuff that's so good they don't need to do more queries.
> a 30% rise in maximum query length per user, and a 10% increase in average query length
Users need to execute more complex queries to find what they are looking for.
I can imagine that's why today's apps suck so much as most of the pain points won't be easily caught by user behavior metrics.
One thing Alex from Organic Maps taught me is how important it is to just listen to your users. Many of the UX improvements were driven by addressing complaints from e-mail feedback.
But the opposite is equally possible: a terrible search tool could regularly fail to find what the user is looking for or produce music that they enjoy. In this situation, I can also imagine users searching more, because it takes more search effort to find something they like.
The key is why users are searching. In Spotify's case I imagine that you could try to connect the number of searches per listen, or how often a search results in a listen and how often those listens result in a positive rating. There are probably more options, but there needs to be some way of connecting the amount of search with how the user feels about those search results.
And yeah, using nothing other than search volume is probably a bad way to go about it.
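A rough sketch of tying search volume to outcomes, assuming a hypothetical event log where each search records whether it led to a listen and whether that listen was rated positively. The `SearchEvent` fields and the metric names are made up for illustration, not anything Spotify has described:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchEvent:
    # Hypothetical log record; field names are illustrative assumptions.
    user_id: str
    query: str
    led_to_listen: bool
    rated_positive: Optional[bool] = None  # None if the listen was never rated


def search_quality(events):
    """Relate the amount of searching to what happened after each search."""
    searches = len(events)
    listens = [e for e in events if e.led_to_listen]
    rated = [e for e in listens if e.rated_positive is not None]
    positive = [e for e in rated if e.rated_positive]
    return {
        "searches": searches,
        # How many searches it takes, on average, to get one listen.
        "searches_per_listen": searches / len(listens) if listens else float("inf"),
        # How often a search results in a listen.
        "listen_rate": len(listens) / searches if searches else 0.0,
        # How often a rated listen was rated positively.
        "positive_rate": len(positive) / len(rated) if rated else 0.0,
    }


events = [
    SearchEvent("u1", "nin", False),
    SearchEvent("u1", "nine inch nails", True, True),
    SearchEvent("u2", "pantera", True, False),
    SearchEvent("u2", "muse", True, None),
]
print(search_quality(events))
```

A rise in `searches` with a flat or falling `listen_rate` would point toward the "terrible search tool" reading; a rise alongside a healthy `positive_rate` would point the other way.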
I thought it was very heavy on jargon. Like, it was written in a way that makes the author appear very intelligent without necessarily effectively conveying information to the audience. This is something that I've often seen authors do in academic papers, and my one published research paper (not first author) is no exception.
I'm by no means an expert in the field of ML, so perhaps I am just not the intended audience. I'm curious if other people here felt the same way when reading though.
Hopefully this observation / opinion isn't too negative.
I am curious -- what would have made it more effective at conveying information to you? Different people learn differently but I wonder how people get beyond the hurdles of jargon.
Usually the best way to learn about things like this for me is to see some actual code or to write things myself, but the lack of coding examples in the text isn't the thing that I find troubling. I don't know, it's just.. like, excessively pointer heavy?
Maybe if you've been in the field long enough, reading a particular term will instantly conjure up an idea of a corresponding algorithm or code block or something and that's what I'm missing.
The intended audience was my team and fellow practitioners; assuming some understanding of the jargon allowed me to skip the basics and write more concisely.
That being said I do find the content difficult to understand, and I think reading the actual papers would be much more enlightening. But it's a great survey of all the things people have done.
It’s not the latest and hottest but super simple to do with LLMs these days and can improve a lexical search engine quite a lot.
Isn't this, like, a sign of what's been happening for the last 20+ years (arxiv, blogs etc.)?
Especially on smartphones, all of your data is in the cloud anyway. Instead of just scraping it for advertising and the FBI, they could also do something useful for the user?
1. Latency is a major issue.
2. Fine-tuning can lead to major improvements and, if I didn't misread, reduce latency.
3. There’s some threshold or problems where prompting or fine tuning should be used.
As an example, I gave it 'What is the impact of LLMs on search engines?' and it suggested three alternative searches under keywords, the keyword 'Specificity' has the suggested question 'How do large language models (LLMs) impact the accuracy and relevance of search engine results compared to traditional search algorithms?'
It's a really cool trick that doesn't take much to implement.
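For what it's worth, a minimal sketch of how such a query-suggestion step might be wired up. The prompt wording, the JSON shape, and the `llm` callable are all assumptions for illustration; `fake_llm` is a stand-in so the example runs offline and does not represent any real model API:

```python
import json
from typing import Callable, List

# Hypothetical prompt; a real one would need tuning for the model in use.
PROMPT = (
    "Suggest {n} alternative search queries for the question below.\n"
    "Answer with a JSON list of objects with keys 'keyword' and 'query'.\n"
    "Question: {question}"
)


def suggest_queries(question: str, llm: Callable[[str], str], n: int = 3) -> List[dict]:
    """Ask an LLM for query rewrites; `llm` is any prompt -> text callable."""
    raw = llm(PROMPT.format(n=n, question=question))
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # caller falls back to the original query
    if not isinstance(items, list):
        return []
    # Keep only well-formed suggestions, at most n of them.
    return [i for i in items if isinstance(i, dict) and "query" in i][:n]


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, returning one canned suggestion.
    return json.dumps([
        {
            "keyword": "Specificity",
            "query": "How do large language models (LLMs) impact the accuracy "
                     "and relevance of search engine results compared to "
                     "traditional search algorithms?",
        }
    ])


for s in suggest_queries("What is the impact of LLMs on search engines?", fake_llm):
    print(s["keyword"], "->", s["query"])
```

The suggested queries can then be run against the existing search engine alongside the original one.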
I'll offer my take as an outside observer. If someone has better insights, feel free to share as well.
In market terms, I think it is because Google, Microsoft and Apple are all still trying with varied success. It has to be them because that's where the bulk of the users are. They are all also public companies with impatient investors wanting the stock to go up and to the right. So, they are both cautious about what they ship to billions of devices (brand protection) and cautious about "opening up" their OS beyond what they have already done (fear of disruption).
In technical terms, it is taking a while because if the tool is going to use LLMs, then they need to solve for 99.999% of the reliability problems (brand protection) that come with that tech. They need to solve for power consumption (either on edge or in the data centers) due to their sheer scale.
So, their choices are to ship fast (which Google has been trying to do more) and iterate in public, or to partner with other product companies by investing in them (which Microsoft has been doing with OpenAI and Google is doing with Anthropic, etc.).
Apple is taking some middle path but they just fired the person who was heading up the initiative [1] so let's see how that goes.
My two cents.
[1] https://www.reuters.com/technology/artificial-intelligence/a...
And to a certain extent for the Microsoft cloud experience as well: https://www.theverge.com/2024/10/8/24265312/microsoft-onedri...
I'd assume most people organise their files so that they know where things are as well.
This only works if you remember specific substrings. An LLM (or some other language model) can summarize and interpolate. It can be asked to find that file that mentions a transaction for buying candy, and it has a fair chance of finding it, even if none of the words "transaction", "buying" or "candy" are present in the file, e.g. it says "shelled out $17 for a huge pack of gobstoppers".
> I'd assume most people organise their files
You'll be shocked, but...
I was hoping an LLM would have a context of all of my content (text and visual) and for the first time use my computer's data as a knowledge base.
Queries like “what was my design file for that x service?” Today that's impossible to answer unless you have organized your data yourself.
Why do we still have to organize our data manually?
Most people I see at work and outside don't care, and they want the stupid machine to deal with it.
That is why smartphones and tablets are moving away from providing "file system" access.
It is super annoying for me, but most people want to save their tax form or their baby photo without even understanding that each is a different file type, because they couldn't care less about file types, let alone making a folder structure to keep them organized.
And do not forget the incredible number of actual humans FAANG pays every day to evaluate any changes in result sets for top x,000 queries.
Most of these papers are specialized increments on high baselines for a primarily commercial problem. Likewise, they focus on optimizing phenomena that occur in their product, which may not occur in others. E.g., the Netflix sliding window is neato to see the result of, but I'd rather students use their freedom to explore bigger ideas like mamba, and leave sliding windows to a master's student who is experimenting with intentionally narrowly scoped tweaks. At that point, the top PhD grads at industrial labs will probably win.
That said, recsys is a general formulation with applications beyond shopping carts and social feeds, and bigger ideas do come out, which I'd expect competitive labs to do projects on. GNN for recsys was a big bet a couple of years ago, and LLMs now, and it is curious to me that those bigger shifts are industrial labs' papers, as you say. Maybe the statement there is that recsys is one of the areas that industry hires a lot of PhDs for, as it is so core to revenue lift: academia has regular representation, while industry is overrepresented.
(the reasonable way is embedding search, which runs much faster with some precomputation, but you still have to store things)
Is this because you want it to continuously watch for live data that could match your need?
If I go through my current tasks and see that for some task I need a set of documents, emails, etc., why can't I just prompt the system to get them in 30-ish minutes? But as someone already stated, Apple Intelligence is supposed to fill this gap.
Many of us have ongoing problems pending for years - for just "a week", "where do I sign".
It really depends on the task.
> The idea was that he could craft queries in this that he did not expect to finish quickly but which he could let run for hours or days and how freeing it was to do more advanced research this way.
Just run the biggest model you can find out of swap and wait a long time for it to finish.
You'll obviously see more focus on smaller models, because most people aren't willing to wait weeks for their slop, and also don't have server GPU clusters to run huge models.
This kills the SSD
That whole thing can be simplified to: compute and store embeddings for docs, compute embeddings for query, find most similar docs.
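A toy sketch of those three steps. The `embed` function here is only a stand-in (a bag-of-words count vector) so the example runs without any dependencies; a real system would swap in a learned embedding model, which is what makes "candy" able to match "gobstoppers". The file names and contents are invented for illustration:

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words counts.
    # Unlike a learned model, it only matches on shared words.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs):
    # Step 1: compute and store embeddings for docs.
    return {name: embed(text) for name, text in docs.items()}


def search(index, query, k=3):
    # Step 2: compute the embedding for the query.
    q = embed(query)
    # Step 3: find the most similar docs.
    ranked = sorted(index, key=lambda name: cosine(q, index[name]), reverse=True)
    return ranked[:k]


docs = {
    "receipt.txt": "shelled out $17 for a huge pack of gobstoppers",
    "notes.txt": "design file for the billing service",
    "todo.txt": "renew passport and book flights",
}
index = build_index(docs)
print(search(index, "billing service design", k=1))  # ['notes.txt']
```

The index is built once and stored, so each query only pays for one embedding plus the similarity scan.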
I really believe that this is not an actual problem in need of solving, but instead creating a tool (personal AI assistant) and trying to find a use case.
Edit0: note to self, rambling - assuming there exists valuable information that one needs to access in their files, but one doesn't know where it is, when it was made, its name or other information about it (as you could find said file right away with this information).
Say you need information from some documentation like the C standard - you need precise information on some process. Is it not much simpler to just open the doc and use the index? Then again, for you to be aware of the C standard makes the query useless.
If it's from something less well organised, say you want letters you wrote to your significant other, maybe the assistant could help. But then again, what are you asking? How hard is it to keep your letters in a folder? Or even simply know what you've done (I surely can't imagine forgetting things I've created but somehow finding use in an LLM that finds them for me).
Like asking it "what is my opinion on x" or "what's a good compliment I wrote" is nonsensical to me, but asking it about external resources makes the idea of training it on your own data pointless. "How did I write X API" - just open your file, no? You know where it is, you made it.
Like saying "get me that picture of Uncle Tony in Florida" might save you 10 seconds instead of going into your files and thinking about when you got that picture, but it's not solving a real issue or making things more efficient. (Edit1: if you don't know Tony, when you got the picture, or what it's a picture of, why are you querying? What's the use case for this information, is it just to prove it can be done? It feels like the user needs to contort themselves into a small niche for this product to be useful.)
Either it's used for non valuable work (menial search) or you already know how to get the answer you need.
I cannot imagine a query that would be useful compared to simply being aware of what's in your computer. And if you're not aware of it, how do you search for it?
> "get me that picture of Uncle Tony in Florida" might save you 10 seconds instead of going into your files and thinking about when you got that picture
I don't have a memory for time, and I can't picture things in my mind. Thinking about when I took a picture does nothing for me, I could be out by years. Having some unified natural language search engine would be amazing for me. I might remember it was a sunny day and that we got ice cream, and that's what I want to search on.
The "small niche" use case for me is often my daughter wants to see a photo of a family member I'm talking about, or I want to remember some other aspect of the day and the photo triggers that for me.
I know the context and the content but not the specific substrings in an email I received several years ago.
Here's one of the first things that Gemini in Gmail actually helped with. I wanted to check when I bought a car seat for my kids, which one it was and how much it cost.
So I knew roughly when I bought it, I knew it was a receipt I was looking for, and I knew it was for a child seat. I know the context here.
What I struggled with was finding the exact text that would be in that. There are hundreds or more emails with invoice/receipt/order in. I didn't recall exactly who I bought it from, and there are large numbers of more advertising emails with kids seats in.
I couldn't easily find it, because the actual email I wanted did not say child seat in it. It had a brand and other information, but nothing in the text had a substring I was searching for. I might have found it with "booster seat" but I didn't think of that exact phrase at the time.
Instead I asked Gemini to find it. It can then trawl through a bunch of emails and find things that mean, but do not say, child seat.
Enjoy your day.
The only way is to use the product yourself and honestly engage with it. Stats can't answer this question.
Nin, NIN, nine inch nails, Trent Reznor
VS
Nin, pantera, nail bomb, muse
This should be easy to differentiate, with a "[someone's name] distance algorithm" or such, right?
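One hedged sketch, assuming the distance in question is something like Levenshtein edit distance. The `session_drift` score (mean normalized edit distance between consecutive queries) is a made-up illustration, not an established metric:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = cur
    return prev[-1]


def session_drift(queries):
    """Mean normalized edit distance between consecutive queries (0 = identical)."""
    qs = [q.strip().lower() for q in queries]
    steps = [
        levenshtein(a, b) / max(len(a), len(b), 1)
        for a, b in zip(qs, qs[1:])
    ]
    return sum(steps) / len(steps) if steps else 0.0


refining = ["Nin", "NIN", "nine inch nails", "Trent Reznor"]
exploring = ["Nin", "pantera", "nail bomb", "muse"]
print(session_drift(refining), session_drift(exploring))
```

Edit distance treats "Nin" to "NIN" as no change once case is ignored, but it has no way of knowing that "Trent Reznor" belongs with "nine inch nails", so on its own it probably won't separate the two sessions cleanly. That last step needs something semantic, such as embeddings or artist metadata.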