Creating AI assistant with GPT and Ruby and Redis using embeddings (release.com)

113 points by erik_landerholm 3 years ago | 25 comments

Mizza 3 years ago

From a discussion with a friend today:

Are embeddings a hack? Is building out tooling and databases and APIs and companies around embeddings all going to be for naught as soon as there's a solid LLM/API with a big enough context window?

jrpt 3 years ago

I don't think the context window will ever be big enough for some use cases. There was a recent paper talking about a million tokens, but that's just the Harry Potter books. Which is amazing, but you know there are going to be use cases with more than seven books' worth of text to work with. Furthermore, performance will be better when you don't have to give it all 1 million tokens, just the most relevant parts of the context.

regiswilson 3 years ago

The short answer is that, yes, embeddings are probably a hack in the same way that using bits or short variable names were hacks to reduce memory usage. At some point you are correct: someone would prompt "given <large amount of data>, answer <user request>".

lukev 3 years ago

But embedding-based semantic search can handle arbitrarily sized databases. I fully believe context windows are going to grow; I am skeptical they will grow to cover "all your company's documents" or even "the full encyclopedia" sizes.
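To make the semantic-search idea concrete, here is a minimal sketch in Ruby of ranking stored chunks by cosine similarity to a query vector. Everything in it is an assumption for illustration: the helper names `cosine_similarity` and `top_k` are hypothetical, the three-dimensional vectors are made up, and a real setup would get embeddings from a model and keep them in a vector store rather than an in-memory array.

```ruby
# Toy in-memory semantic search. The vectors are invented 3-dimensional
# stand-ins; real embeddings come from a model and have many more dimensions.

# Cosine similarity between two equal-length numeric arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Return the k best [text, score] pairs, highest score first.
def top_k(query_vec, index, k)
  index
    .map { |doc| [doc[:text], cosine_similarity(query_vec, doc[:vector])] }
    .max_by(k) { |_text, score| score }
end

# Hypothetical index: each entry pairs a text chunk with its embedding.
index = [
  { text: "How to reset your password",    vector: [0.9, 0.1, 0.0] },
  { text: "Quarterly revenue report",      vector: [0.0, 0.2, 0.9] },
  { text: "Account login troubleshooting", vector: [0.8, 0.3, 0.1] }
]

query = [1.0, 0.2, 0.0] # pretend embedding of "I can't log in"
top_k(query, index, 2).each do |text, score|
  puts format("%.3f  %s", score, text)
end
```

The linear scan here costs time proportional to the number of stored chunks; the point of a dedicated vector index is to avoid that scan, which is why the database can grow well past any context window.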

dmix 3 years ago

It's more than just optimizing for space (which is still going to be important), it's also about using vector databases to seed the data from a wider dataset and translating that into something the AI can use. I mean technically in the far future you could dump a whole database into the 'context' and work off of it, but Vector DBs will fill that role in the meantime and add a memory layer on top of it for future queries.

joe_the_user 3 years ago

I'd say:

Yes - embeddings are a hack.

No - there won't be anything like a "real API" unless there's a new discovery or a shift in the way LLMs are constructed. It's not theoretically impossible, but there's no clear way to get guaranteed results from present-day LLMs; all they do is output guesses from their input text (combining prompt text and then user text).

jeremy_k 3 years ago

I can't say I'm very well versed in all of this, but I was asking my coworkers today whether embeddings were the way forward or if doing your own training would be more beneficial. Or, better yet, could you take an open source model and train it specifically on just your content; would that yield better results?

Expanding context seems like an approach, but if you're trying to get an answer about your company's documentation, why would you need the entirety of GPT-X?

simonw 3 years ago

Every time I've asked this question the answer has been that injecting relevant content into the prompt provides much better results than attempting to fine-tune a model on your own content.

Here's a relevant quote: https://simonwillison.net/2023/Apr/15/ted-sanders-openai/
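As a rough illustration of what "injecting relevant content into the prompt" can look like, here is a minimal Ruby sketch that packs the best-scoring chunks into a fixed budget and wraps them in instructions. The helper name `build_prompt`, the `max_chars` parameter, the prompt wording, and the sample chunks are all hypothetical; a character budget stands in for a real token count, and the scores are assumed to come from an earlier similarity search.

```ruby
# Assemble a prompt from retrieved chunks, best score first, stopping at the
# first chunk that would exceed the budget.
# Assumption: max_chars is a crude stand-in for a model's token limit.
def build_prompt(question, scored_chunks, max_chars: 500)
  context = []
  used = 0

  scored_chunks.sort_by { |chunk| -chunk[:score] }.each do |chunk|
    break if used + chunk[:text].length > max_chars

    context << chunk[:text]
    used += chunk[:text].length
  end

  <<~PROMPT
    Answer the question using only the context below.
    If the context does not contain the answer, say you don't know.

    Context:
    #{context.map { |text| "- #{text}" }.join("\n")}

    Question: #{question}
  PROMPT
end

# Hypothetical retrieval results.
chunks = [
  { text: "Refunds are processed within 5 days.",         score: 0.91 },
  { text: "Our office is in Berlin.",                     score: 0.20 },
  { text: "Refund requests go through the billing page.", score: 0.85 }
]

puts build_prompt("How do refunds work?", chunks, max_chars: 85)
```

With the 85-character budget, only the two refund chunks fit, so the low-scoring chunk is left out and the model sees just the material most likely to matter.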

dragonwriter 3 years ago

The broad general training of GPT-X (and fine tuning on your content) provides context and (loosely speaking, at least) “analytical” ability; search-via-embeddings to inject material into the prompt provides exact recall of specific material, with capacity greater than the context limit.

Analogous, more or less, to a human with general experience (base training), experience with your code base (fine tuning), and the ability to reference the current code base directly (embedding-based search/recall). All three have a role; they are complementary rather than mutually exclusive.

fzliu 3 years ago

Even with an incredibly long context window (say, 1M tokens), attention still suffers from a problem with long-term dependencies. This is probably why OpenAI hasn't publicly released their 32k token length model just yet.

Der_Einzige 3 years ago

I think they haven't released it because the capabilities it has are simply too powerful when combined with a vectorDB.

toxicFork 3 years ago

Embeddings are useful for sentiment analysis and search in general, but given a "powerful enough AI with enough of a context window" they may be obsolete indeed, if it can do all of those things.

welfare 3 years ago

That's gotta be a Hacker News bingo if I've ever seen one.

taf2 3 years ago

Think we can optimize this with Rust

darkwater 3 years ago

And Postgres.

strudey 3 years ago

Can use https://github.com/alexrudall/ruby-openai to do this sort of thing also :)