Kagi raises $670k (blog.kagi.com)
The problem is, I really don't want to be searching using an account that's linked to my credit card and thus my legal identity. That's a massive possible breach in privacy. And yeah, they say they don't log searches or link them to any account, but the thing is, even if I believe they don't right now, the fact that accounts are set up this way means that they very easily could start logging your searches to your legal identity in the future. It's like giving someone your social security number because they promise not to do anything nefarious with it. Maybe you trust them right now, but it's still a bad idea because of the opening it creates. Maybe they end up doing it to comply with some government regulation. Or maybe because it makes it easier to count searches. Or maybe because they start running ads.
Why would they start running ads? Well, because under capitalism, where everything is supposed to grow in profits and size every year so that the shareholders can get their money, just having a source of income outside of ads isn't enough to guarantee a company won't decide to also run ads, as a source of extra income. Why would any capitalist company ever turn down an opportunity to make more profit, just because it already has another profit source? Obviously the constant need for growth creates an incentive to vacuum up every possible way to make profit, not an incentive to be satisfied with a stable, reliable source of income that's enough to keep the lights on and everyone employed. That isn't how capitalism works.
I've seen Krita reach HN's front page. Not "Krita 5 just released". Literally just signifying Krita's existence.
If it's a Google alternative, wouldn't it be difficult, since I imagine Google has so much lock-in by now? I imagine they have special deals with the likes of Cloudflare so that Google's spiders are allowed but spiders from random companies that raised less than a million dollars aren't. Is it even legal to crawl the web anymore if you don't have a team of lobbyists stationed in DC and Brussels constantly pleasuring every politician? Probably they would say you are doing cyberfraud or wirecrime or interstate proxyterror of some kind, whose rules are buried in a stack of hundreds of thousands of pages of regulations.
They are specifically drawing attention to the fact that they are aiming to build a long-term sustainable business, not grow quickly and exit. That's likely to be important to their customers.
They needed some cash for investment, but didn't need to maximise investment to push for rapid growth.
Because "news" is something that is "new" (in this case, raising a tiny amount) as opposed to what everybody else is doing (raising 100x or 1000x that amount)?
There is of course an availability and reporting bias (there are countless small firms raising small amounts in various sectors), but Kagi is reasonably well known in this audience, so the news is interesting.
So it's more what they're doing rather than how much they raised.
Welp. Until now their search and their browser were supported by me and people like me, but now they are supported by investors who will want ROI. As history shows, the best way to get ROI is to become free and mine user data. Congrats to the team on the paycheck though!
I believe that it stopped a lot of folks from being able to invest in Kagi, since this issue was brought up in the FAQ sent out by Vlad in May.
> Q: Can I still invest if I am not an accredited investor?
> A: Not through this round. If we organize a different vehicle in the future (e.g., crowdfunding), we will let you know.
I'm assuming accredited investors reduce the paperwork / liability and are a better plan A. As they secured their funding target with that scope, they don't need to action the plan B of non-accredited investors. That's less energy spent on investor relations and more on the product.
Google will have its lock-in, but there's a market share of people like me (and many others here on HN) who have been let down by the constant enshittification of Google's search and will pay for an alternative providing the experience we were used to with Google some 10-15 years ago.
Kagi is being smart, they don't need to become a multi-billion company, it's a small team (last I've seen it was about 15 people), providing a good enough product to have paying customers. I've been using it since Nov/2022 and been pretty happy to pay the US$ 10/month for a better and more private search product.
You are absolutely right. This is what this paragraph from the announcement addresses.
“Looking ahead, we are cognizant that when building Kagi, we are running a marathon and not a sprint. Altering entrenched habits in the society, such as the reliance on personal data and even pieces of what makes us human as currency for essential online activities like search and browsing, is a gradual process that will take time.”
Public trading means a clear protocol for ROI, and it allows for many small investors who can have as little say as you specify (non-voting shares etc.). It comes with a requirement for a lot of documentation and transparency too. So, accountability. If a business I like wants to grow and needs money, this is the route I want it to take.
You don't need to organize fundraising events etc.; you just do your thing, and if you do it well you get extra money.
But of course this all assumes Kagi cannot profit enough from its users. Which apparently it cannot.
Edit: 300/day is too much, apparently I use about 100/day according to my search history.
The person who is willing to pay for a search engine is also going to be a person who cares more about search and uses it more. My best guess is that 1000 searches per month is right on the edge of what their average user can get by with. It's fair enough, they do need to make money, and I have no problem believing that 300 is about right for the average internet user. It just gives a false impression of your expected cost as a potential customer.
15 years ago, the idea of competing with Google would have been laughable. Not only did they have all the money, their search engine was so incredibly good that it's hard to imagine what a new market entrant could have brought to the table.
Today, Google's results are so hilariously bad that I have no doubt the majority of HN users could easily come up with ranking algorithms that would outperform Google's by a large margin. It's not that the problem is so difficult; the quality of results clearly indicates they aren't even trying.
So please, go ahead and punish Google, Bing, and Co for what they have allowed to happen during the past decade or so.
Google is facing incredible adversaries whose only goal is to subvert their search. These adversaries are extremely skilled, motivated, and effective.
You're probably right that many people can do better than Google, until you get enough traction that these adversaries pay attention to you. And then you'll have the exact same problem Google does.
Don't overestimate yourself. Google/Bing are doing a pretty good job considering what they're facing; instead, curse out the SEO teams who are subverting them.
(And no, I'm not saying the major search players are without concern or condemnation, but "search quality" is not one of those issues for which they're directly responsible)
Are they? Google could manually remove the worst SEO sites from the index and reduce the spam you get for dev queries by like 95%.
It takes time to build up rank and like 10s to remove the site.
There is an extension that does this client side for Firefox. Try it. (I don't remember the name. I only have it on desktop).
The only explanations are that Google is too dysfunctional to do anything or they are doing it on purpose to increase revenue. Otherwise there is no way the SEO would suddenly become this bad.
Later on you would need cash to scale this business, but for early stage product development I think $500k to $1 million is the sweet spot.
They may have access to a talent pool willing to work for their vision at significantly reduced rates, but unless they effectively sell themselves to a big cloud provider, they can’t significantly reduce the infrastructure cost.
Don't free search engines have a competitive edge, because for every search performed, the clicked result better informs future search results?
As such, the more users searching, the more it improves the quality of the search results.
And since Kagi is a paid offering, they will inevitably have fewer searches performed, leading to lower-quality search results?
As a user, I've found Kagi is much better than free search engines. With DuckDuckGo, I often found myself reverting to Google. With Kagi, almost never.
But are they depending on results from Google, Bing etc., or are they crawling the web themselves?
[1]: https://apps.apple.com/us/app/xsearch-for-safari/id157990206...
The only options for me are:
- Google
- Yahoo
- Bing
- DuckDuckGo (my current option)
- Ecosia
ref: https://help.kagi.com/kagi/getting-started/faqs.html#why-doe...
Eh... This is a tall order when competition is 'free'. Anyways, I wish them good luck as competition is inherently good.
To be clear, I don't think Kagi will be the next Google. But I do think they have a chance to survive if they don't ruthlessly overextend their scope.
Especially for technical queries or prior-art queries, it really feels like a 2007 Google crawler, which is actually a compliment.
Congrats and good luck to the team!
Weirdly, this seems to be an entirely different thing. I guess the name and/or domain got sold to a completely distinct operation.
https://www.akamaidesign.com/studio/
UPDATE: his web design and hosting service was very advanced for its time!
The real magic happens when you stick the two together. Let traditional search find the relevant documents, and then interact with them through a LLM. This isn't a shortcoming in model tuning or context window size.
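A rough sketch of the "search first, then hand the documents to an LLM" idea described above. Everything here is hypothetical: the toy corpus, the term-overlap scoring, and the prompt format are stand-ins, and no real search index or model API is called.

```python
from collections import Counter

# Toy corpus standing in for a real search index (hypothetical documents).
DOCS = {
    "doc1": "Kagi is a paid search engine with no ads",
    "doc2": "Crystal is a compiled language with Ruby-like syntax",
    "doc3": "A SAFE note postpones the valuation negotiation",
}

def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split()]

def search(query, docs, k=2):
    """Rank documents by how many query terms they share (plain term overlap)."""
    q = Counter(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        words = Counter(tokenize(text))
        score = sum(min(q[t], words[t]) for t in q)
        if score > 0:
            scored.append((score, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def build_prompt(question, docs, doc_ids):
    """Put the retrieved passages in front of the question for whatever LLM is used."""
    context = "\n".join(f"[{d}] {docs[d]}" for d in doc_ids)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"

hits = search("paid search engine ads", DOCS)
prompt = build_prompt("Which search engine is paid and has no ads?", DOCS, hits)
```

The point of the split is that the index decides what is relevant and the model only has to read what it was handed.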
The way I see it, recent AI improvements make the future brighter for new search engines. In a gold rush, there are two types of winners, you can win by beating the rest to the gold, or you can win by being the guy selling maps and pickaxes. Traditional search indices fall squarely in the second camp.
AI actually opens new avenues for profit for traditional search, since while it's notoriously difficult to monetize an internet search engine, suddenly you can make ends meet selling API access to AI start-ups.
There's a gap in the market for a search engine whose incentive structure doesn't cause it to become corrupted.
Now I am worried.
ChatGPT is the opposite of teenage sex: everybody is doing it, but nobody is admitting to it.
I understand that it’s a paid service but you’re destroying conversion rate, and there’s no way there are enough countervailing benefits to gating your service behind sign up.
Do you think they don't have a huge team doing just that? Easier said than done.
> The only explaintions are that Google is too dysfunctional to do anything or they are doing it on purpose to increase revenue. Otherwise there is no way the SEO would suddenly become this bad.
This is not based in reality.
How can the Stack Overflow mirrors, which kinda hide that they are mirrors, rank so high then? If there was any team at Google doing manual review, the Google engineers would spam them with mails complaining that those sites ruin their copy-pasta coding flow.
But then I saw the “K” and grinned. This is proper bootstrap money, and makes me hopeful that Kagi will stay close to its original mission.
Congrats. I am really pulling for you, Vlad.
By the way, to "bootstrap" a business is short for "pulling yourself up by your own bootstraps" (yes, it's an old phrase) and means not raising any money.
Just as to bootstrap a computer is to start it when it has no code loaded or running. Nowadays it's in ROM so the term is rather obscure but we used to toggle a little bootstrap program in from the front panel or, if you were lucky, load it in from paper tape. That program was very short, just enough to get some more code loaded and start it.
Today I learned! Technically Kagi was bootstrapped up to this point (I've invested $2MM+ of my own money to get us this far). This is the first external fundraise (as noted in the announcement) and what makes it very special to me is that people participating in the round are Kagi users who were already a part of our journey.
Same impression here
If you have the money to participate and lie about having/earning more money then you qualify.
There is no constitutional way to prevent you from investing based on net worth, so the law places consequences on the companies offering investments.
And the only time this affects the company is if they were relying on a regulatory exemption that also allows unaccredited investors, in which case you should have just participated as an unaccredited investor and they were supposed to verify.
Otherwise, the law says “self-certify”, which is the state sanctioned term for lying.
Investment = you are not personally liable and no need to return the money if u fail. It's not ur money.
Bootstrap = you are responsible to pay it back. It's ur money once u get it.
Same here!
I think this is because our eye-brain system optimizes the read operation by scanning the words and mapping them against a list of words we are already familiar with; thus, it sometimes gets the wrong meaning. This trick has been used by some brands to mislead people into buying products with a familiar-looking brand name, like “Adibas” and “Reedok”.
> makes me hopeful that Kagi will stay close to its original mission
I really hope so, but no one can guarantee this would be the case if the company gets bigger with more money being thrown at it.
In its first years, Google’s tagline was “don’t be evil” but they couldn’t deliver on the promise.
OpenAI’s original mission evaporated as soon as ChatGPT became a thing: https://news.ycombinator.com/item?id=34979981
Huh?
Now let's go and blow that money in Vegas.
/s
Search is really important to me, and it isn't just about looking up businesses or Wikipedia pages. I read lots of academic papers and just want it to be easy to separate the wheat from the chaff. Kagi makes that really easy with its lenses. Sure, Brave search has something similar, but I like Kagi's interface better and having to pay them gives me the sense their service won't just be deprecated or significantly changed at the whim of a CEO trying to satisfy advertisers or whatever. Having insight into how many ads and trackers are on a page is nice too, as well as being able to demote and block certain domains. It's nice that I can block a domain and it's also blocked on my phone.
What's especially important to me is the ability to wrap my queries in quotes for literal string matches. Granted, no search engine is perfect at this anymore because the web has become so huge, but Kagi seems to be the only search engine I use that gets string matches right more of the time and doesn't give me fake results when there are no actual matches. That's what pisses me off about The Google. And though I once used DDG, I found that the quality of search started getting worse, and now it seems like quotes barely even work with that search engine.
I hope Kagi succeeds. It doesn't need to compete with The Google. Just be good enough for those willing to pay for it.
I did a search (on kagi) and all the top results wanted me to sign up or subscribe. The joke I wanted to make wasn't worth it for that.
I was reminded of google in present times, loads of results that are almost what I'm looking for, but filled with frustration instead.
I don't feel entitled here, but I do know that in the old web, back when I fell in love with google, I'd have found some hobbyist site that did this without trying to get anything from me.
Anyway I then clicked "non-commercial" and was instantly teleported to the old web, the first result was a free simple tool from some hobbyist.
I also didn't experience a huge increase in search quality that I noticed. I didn't really get into the Lenses stuff even though it seemed cool because I just don't think about things like that when I'm searching. I guess 25 years of googling has conditioned my way of thinking about search.
Kagi feels like something I should really like and I'd be fine with paying for it but I guess I need some more "tips" about how to get the most out of it and/or how to change my way of thinking about this.
One thing I started to use a lot since they removed the usage limit is their summarizer.
I made a little bookmarklet to quickly see a summary of the current page I'm browsing:
https://gist.github.com/Julioevm/68275ea1324046caedfdfb2ba0e...
Never mind, I'm dumb. Just create a new bookmark and paste the raw contents of the file into the URL field.
With 670K?
Cuil went through about $30 million to develop a standalone search engine. And that was fifteen years ago, when search was simpler.
There may be a market for a search engine company that profitably runs a low-cost operation with very few ads and makes real efforts to keep out spam. Everybody is fed up with Google and Bing. There's a great opportunity here to disrupt the industry and destroy a trillion dollars in market cap.
But not for $670K.
- SAFE note round with a valuation cap of $40MM (pre-money)
- 249 investors maximum
- We are looking to raise around $2MM from our users in this way ($5MM max)
- Minimum investment amount per investor is $5,000, and maximum is $500,000.
So the $670k they have now raised is only about a third of what they were aiming for.
When you raise money with a SAFE, you are postponing the negotiation about how much your company is worth. It could easily convert at well under $40M depending on how well they are doing when they next raise.
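To make the "could convert at well under $40M" point concrete, here is a simplified sketch of how a valuation cap works. It assumes a cap-only SAFE with no discount, no other SAFEs or option pool, and a made-up share count; the $5,000 and $40MM figures are the terms quoted above, and everything else is hypothetical.

```python
def safe_conversion(investment, valuation_cap, next_round_pre_money, pre_round_shares):
    """Return (price_per_share, shares_issued) for a cap-only SAFE.

    Simplified model: no discount, no other SAFEs, and both prices are
    computed against the existing share count only.
    """
    round_price = next_round_pre_money / pre_round_shares
    cap_price = valuation_cap / pre_round_shares
    # The cap is a ceiling: the SAFE holder converts at whichever price is lower.
    price = min(round_price, cap_price)
    return price, investment / price

# Hypothetical company with 10M shares outstanding before the priced round.
# Next round prices the company at $20MM pre-money: converts below the cap.
price_low, shares_low = safe_conversion(5_000, 40_000_000, 20_000_000, 10_000_000)
# Next round prices the company at $80MM pre-money: the $40MM cap kicks in.
price_high, shares_high = safe_conversion(5_000, 40_000_000, 80_000_000, 10_000_000)
```

In the first case the investor effectively buys in at a $20MM valuation, in the second at $40MM, which is why the cap is better read as an upper bound than as "the valuation".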
Very happy to see them raising, and very happy to give them my $10 each month. Money well spent.
Kagi has successfully raised $670K in a SAFE note investment round, marking our first external fundraise to date. This was made possible with the participation of 42 accredited investors, most of whom are actual Kagi users.
That is a dense paragraph.
1) What is a "SAFE note"? Google tells me:
A "Simple Agreement for Future Equity" note is a way that startups can raise capital. The SAFE note is a legally binding agreement that allows an investor to buy a specific number of shares for an agreed-upon price at some point in the future, usually when the startup has a subsequent funding round.
2) What are "accredited investors"? Let's assume US. SEC tells me: https://www.sec.gov/education/capitalraising/building-blocks...
Net worth over $1 million, excluding primary residence (individually or with spouse or partner)
Or: Income over $200,000 (individually) or $300,000 (with spouse or partner) in each of the prior two years, and reasonably expects the same for the current year
What Google didn't tell you is that SAFEs were invented by YC ten years ago in order to provide a better angel/seed investing framework than the then-current bespoke convertible notes.
You can read all about it here: https://www.ycombinator.com/documents/
"Accredited investor" is basically someone who is supposed to be sophisticated enough that they won't first give you (a startup) some money, then later on say they didn't know that your startup was risky and sue you. Accepting investments only from accredited investors is a way to shield yourself from that scenario.
Both terms are used fairly often on HN (and really throughout the startup communities anywhere).
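For anyone who prefers the accredited-investor thresholds as code, here is a small sketch of just the two financial tests quoted earlier in the thread. The function name and parameters are made up for illustration, and other routes to accreditation (licences, entities, and so on) are not modelled.

```python
def is_accredited(net_worth_excl_home, incomes_prior_two_years,
                  joint=False, expects_same_this_year=True):
    """Apply the net-worth test or the income test (simplified, US only)."""
    # Test 1: net worth over $1 million, excluding primary residence.
    if net_worth_excl_home > 1_000_000:
        return True
    # Test 2: income over $200k (individual) or $300k (joint) in each of
    # the prior two years, with the same expected for the current year.
    threshold = 300_000 if joint else 200_000
    return (
        len(incomes_prior_two_years) == 2
        and all(income > threshold for income in incomes_prior_two_years)
        and expects_same_this_year
    )

print(is_accredited(1_500_000, [50_000, 50_000]))              # net-worth route
print(is_accredited(100_000, [250_000, 210_000]))              # individual income route
print(is_accredited(100_000, [250_000, 210_000], joint=True))  # below the joint threshold
```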
Google is generating more revenue per "free" user than Kagi's lowest paid plan.
Goes to show just how much money there is in advertising.
https://www.statista.com/statistics/306570/google-annualized...
On the other hand, it did teach me the true cost of searches, and I use a lot more !bangs now, like !mdn or !archwiki (although the search powering both of these is probably funded by donations)
It really does change the dynamic of how you search, if you know that the next search will cost you 1.25¢. Right now I use Ecosia, so I'm actually more motivated to do extra searches, to increase the number of trees they plant. Kagi is the other way around, search has a cost, so be mindful about what you search for. It's a great way for them to have a sense of their load, while also perhaps being ever so slightly more sustainable.
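A quick back-of-the-envelope version of that per-search cost, using only the 1.25¢ figure quoted above. This is plain arithmetic, not Kagi's actual billing logic; plan allowances and any included searches are deliberately left out.

```python
from fractions import Fraction

CENTS_PER_SEARCH = Fraction(5, 4)  # 1.25 cents per search, as quoted above

def metered_cost_usd(searches):
    """Cost in dollars if every search were billed at the per-search rate."""
    return float(CENTS_PER_SEARCH * searches / 100)

def searches_for_budget(budget_usd):
    """How many searches a monthly budget buys at the per-search rate."""
    return int(Fraction(budget_usd) * 100 / CENTS_PER_SEARCH)

print(metered_cost_usd(1000))     # a 1000-search month
print(metered_cost_usd(100 * 30)) # roughly 100 searches a day for 30 days
print(searches_for_budget(10))    # what $10 would cover
```

At that rate a 100-a-day habit is a noticeably different bill from a 1000-a-month one, which is presumably where the "be mindful" effect comes from.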
Ecosia and Kagi aren't targeting the same market. Ecosia, for example, isn't as serious about privacy: https://news.ycombinator.com/item?id=25713050
I've also gotten quite a bit of utility out of their API, with just a thin wrapper for querying via the command line [1]. I think I'd still prefer using the Search API directly (I currently use the GPT-enabled API), but that's only available for Teams at the moment [2].
[1] https://github.com/bcspragu/kagi [2] https://help.kagi.com/kagi/api/search.html
I think the (open) web is troubled from a content, quality and funding perspective.
The old web was the volunteer web. People produced useful content and it was interesting (Digg, Slashdot, StumbleUpon and delicious)
I like reading personal tech blogs from the volunteer web. I get a lot of value from them.
I don't want the web to get worse, so I subscribe to Wikipedia and, at one point, Medium.
I kind of think the secret to quality is to commission high quality authors.
Kagi is just Bing under the hood with an OpenAI summarizer and a lot of tuning. It is fully reliant on the quality of search results it can fetch from Bing's databases.
I know there exists Qwant (a French company), which claims to have an independent engine, but they keep turning their Bing backend on for a lot of searches. The only truly independent engine from Google/Bing with relevant results that I know of is Brave Search. Not a great company, but they do know how to make a good search engine, and are practically alone in the space fighting against the Bing-based alternative search engine business model.
> Our searching includes anonymized requests to traditional search indexes like Google and Bing and vertical sources like Wikipedia, DeepL, and other APIs. We also have our own non-commercial index (Teclis), news index (TinyGem), and an AI for instant answers.
> Are you affiliated with the legendary Kagi shareware platform?
> No. That Kagi went bankrupt in an unfortunate turn of events. We liked the name and acquired it when we got the chance.
The internet changed over time, of course, and Kagi changed with it. Then something bad happened (I don't remember what, and archive.org is being unhelpful finding the announcement) and Kagi shut down. Then I guess Kee sold the domain, and here we are.
And every time the new kagi.com comes up, I think of Kee.
https://tidbits.com/2016/08/04/kagi-shuts-down-after-falling...
I need a way to pay anonymously. For me, this usually means via Monero. Searches are simply too personal and too habitually unrestrained for me to trust to anyone who can de-anonymize them.
Honestly, in retrospect I'll acknowledge that my intense quest for privacy is more ideological than practical, and it wouldn't likely be a big deal if my searches were leaked or whatnot, but still. I've wanted to pay for Kagi and looked for a way to multiple times, but until I can do so anonymously, it ain't happening.
From https://news.ycombinator.com/item?id=32687071:
> Crystal powers 90% of the Kagi search backend (reminder being Python). Highlights are great performance and concurency handling. Biggest downsides at this moment are compilation speed (does not take advantage of multi CPU cores) and debugging tools.
I think Kagi is focused on a great niche with a simple business model. Unsurprisingly this works out well in the long run.
This is money expecting "more of this, just a bit better" rather than "giant exit, who cares what happens to the product afterwards"
Hint: non-wealthy people can become an Accredited Investor by passing the Series 7 or Series 65 (Investment Adviser), the latter of which can be done all on your own. I'm considering the Series 65, because it will also give me better retirement investment options as a US citizen (and taxpayer) residing in the EU.
I was thinking the same, but then again, the investment is rather low, so it is a lot easier to dump that investor, if values don't align in the future. I might be reading it all wrong, but it also seems like the money comes from 42 different investors, making it even easier to part ways.
G search is right out for me because I avoid any G products whenever possible. Also it's quite bad these days. So I primarily use DuckDuckGo, except it's honestly not that good either. And I'm very suspicious of how much money they have for advertising with a free, privacy-focused product. It just doesn't feel sustainable. I could be wrong! Maybe their non-personalized ads are more sustainable than Kagi's subscription model. But the search kind of sucks. Actually, sometimes I do end up using `!g` to improve my results (rather: to actually get relevant results), and it feels bad.
So I'm going to switch over to Kagi and see if that can deliver the results I want/need. If so, I'll happily pay. I might even pay $25/mo... it's a lot for a subscription, but it's not a lot when compared to my monthly income, so if it makes my job easier it will feel like a fair trade.
"Goggles enable any individual—or community of people—to alter the ranking of Brave Search by using a set of instructions (rules and filters). Anyone can create, apply, or extend a Goggle. Essentially Goggles act as a custom re-ranking on top of the Brave search index."
I don't think the search quality is anything special. I have never figured out how to use Lenses, and I don't even care to try them again. I like it for privacy, and I like that you can give weight to different websites, and even block ones you don't want to see again. I know there are Google plugins for blocking sources as well.
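The "give weight to different websites, and even block ones" feature is easy to picture as a re-ranking pass over whatever the underlying index returns. A minimal sketch, assuming results arrive as (url, score) pairs; the multiplier scheme and the example domains are made up and this is not how Kagi or Brave Goggles are actually implemented.

```python
from urllib.parse import urlparse

def rerank(results, weights, blocked):
    """Re-rank (url, base_score) pairs using per-domain preferences.

    weights: domain -> score multiplier (>1 promotes, <1 demotes)
    blocked: set of domains to drop entirely
    """
    rescored = []
    for url, score in results:
        domain = urlparse(url).hostname or ""
        if domain in blocked:
            continue  # blocked domains never reach the result page
        rescored.append((url, score * weights.get(domain, 1.0)))
    return sorted(rescored, key=lambda r: r[1], reverse=True)

results = [
    ("https://spam-mirror.example/q/123", 0.9),
    ("https://stackoverflow.com/q/123", 0.8),
    ("https://blog.example/post", 0.5),
]
ranked = rerank(results, weights={"blog.example": 2.0}, blocked={"spam-mirror.example"})
```

Here the blocked mirror disappears and the promoted blog moves ahead of the higher-scoring default result.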
I don't have any tips. I think Kagi is about as good as DuckDuckGo, and the advantage it has is that it costs money. I'm at the point where I don't even trust a nominally privacy-focused search engine like DDG not to drift into the unethical. For some reason, maybe naivety, I think a company that takes money in exchange for guaranteeing my privacy is theoretically more reliable, if only because it would be too audacious to outright lie about that.
Jun 2023 1171
May 2023 822
Apr 2023 2451
Mar 2023 3700
Feb 2023 3632
Jan 2023 5664
Dec 2022 1245
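Assuming the figures above are one user's monthly search counts, a few lines turn them into per-day averages, which is the unit the rest of the thread argues in. The numbers are copied from the list; a partial first or last month would skew its average.

```python
import calendar

# (year, month) -> searches, copied from the list above
MONTHLY_SEARCHES = {
    (2023, 6): 1171,
    (2023, 5): 822,
    (2023, 4): 2451,
    (2023, 3): 3700,
    (2023, 2): 3632,
    (2023, 1): 5664,
    (2022, 12): 1245,
}

def per_day(year, month, count):
    """Average searches per calendar day in the given month."""
    days = calendar.monthrange(year, month)[1]
    return count / days

daily = {ym: round(per_day(*ym, count), 1) for ym, count in MONTHLY_SEARCHES.items()}
total = sum(MONTHLY_SEARCHES.values())

for (year, month), avg in daily.items():
    print(f"{calendar.month_abbr[month]} {year}: {avg} searches/day")
```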
I found that most of my searches are on my phone actually.
For me, I notice better search results than Google (not night and day, but noticeable). The ability to block/promote/demote certain sites is fantastic, and I like the URL rules you can create. For instance, I've set Reddit links to open in old.reddit.com, and I've set Youtube links to redirect to an Invidious instance.
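The URL rules mentioned above amount to rewriting the host of a result link before it is shown. A small sketch of that idea, not Kagi's actual rule syntax: the rule table is hypothetical, `invidious.example` is a placeholder rather than a real instance, and it assumes the target site keeps the same path and query layout.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical rule table: source host -> replacement host.
HOST_RULES = {
    "www.reddit.com": "old.reddit.com",
    "reddit.com": "old.reddit.com",
    "www.youtube.com": "invidious.example",
    "youtube.com": "invidious.example",
}

def rewrite(url, rules=HOST_RULES):
    """Swap the host if a rule matches; leave every other URL untouched."""
    parts = urlsplit(url)
    new_host = rules.get(parts.hostname or "")
    if new_host is None:
        return url
    # Replacing netloc also drops any port or userinfo from the original URL.
    return urlunsplit(parts._replace(netloc=new_host))

print(rewrite("https://www.reddit.com/r/rust/comments/abc?sort=top"))
print(rewrite("https://www.youtube.com/watch?v=xyz"))
print(rewrite("https://example.org/page"))
```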
It's got some other nice things—I like the universal summarizer a lot, and the search result summarizer has come in handy many times. But mostly, I just like using something that isn't Google without feeling like I'm compromising (which I did feel with both DDG and Brave).
On the other end, I’m seeing Google results get really really bad. Mostly just missing results. When I search the same term on Bing there are pages and pages of results. I’ve no idea what’s going on at Google. I don’t use it logged in and don’t know if that affects the results.
Regarding search quality, I don't think it's a revolutionary improvement over Google, but for me it's been noticeable during the time I used it, whereas Neeva and Brave have been very noticeable downgrades. I think Kagi shines once you lean into the tools they provide and start upranking and downranking different domains.
Keep in mind it's been a couple of months since I last used it and they've added the search result summarizer and some other AI tools since then that might help tip the balance their way.
The ranking and lenses seemed very cool to me but I burned through my free tier before I could really use them. Maybe I'll try again at the $10/mo price.
Lenses are actually why I subscribed mostly. I search for organic data often while trying to find help for a product or the opinions on a service or something like that. Regular search is ok, changing to the Forum Lens is mind blowing because I only get discussions from real people.
You've literally managed to find the most insignificant thing to worry about in existence.
But if quality is not impressive to you there's no reason to use the service.
> You've literally managed to find the most insignificant thing to worry about in existence.
It's clearly not "insignificant" since others here have voiced the same concern or annoyance. I think we have enough research/studies to show that metering/limiting has an effect on how people use something even if the overage fee is low.
I used the trial and I wasn't sold on it, I was asking for people to tell me how they get value out of it, you're doing literally the opposite and being a jerk about it. If anything your comment makes me want to just forget about Kagi.
* in your opinion, in your circumstances, for your usage patterns.
This investment is a good sign, though. I'm rooting for them. We need more competition in this space, and a proper business not hijacked by advertisers is always a good thing.
Did not have enough time to actually use them though.
Define success? It outperforms every other search engine in terms of quality of results and at least from what I understand, they're not too far off from breaking even on their salaries.
Neeva was a VC funded abomination and it showed, they even had deals with "big brand" companies like LastPass to offer bundles, but their core product sucked. Neeva search results were often worse than Google. That's not the case with Kagi.
It's not hard to define success. It means having more revenue than their cost, and paying all employees market rate salary (including the founder himself).
Their search engine is also better than google for me for most use cases already, it has a clean and smart design with well thought out features that add value.
I hate that Orion is not cross-platform and is limited to the Apple ecosystem.
With that said it is native and my default choice for macOS and iOS.
No telemetry, ad blocker built in, performant, and supports extensions from Chrome and Firefox, on iOS too.
Hands down the best native tree style tabs of any browser on the market on top of that and easy tab syncing.
It’s still in development but it’s stable enough for me to be my main browser.
These two products show some serious technical ability from a small team. I bet on companies who invest in and show technical ability over those with VC money, big names focused on marketing and growth.
I used DDG for years, but more than half my queries ended up being prepended with !g because their results just weren't very good. With Kagi, I fall back on Google maybe once a month, and usually Google doesn't end up finding anything better.
The absolute worst way you can arrange a search engine project is as some sort of Manhattan Project with a humongous budget and an army of professors and experts.
History is littered with bold and ambitious Google killers that went nowhere.
You can throw almost any amount of money at an operation, and it will gobble it up. I think search in particular is very prone to bloated R&D budgets and various forms of mission creep.
Yet the underlying reality is that software development scales very poorly with organization size, and the larger your organization is, the harder it is to steer and the more difficult it is to make the right calls. A squad of 3-4 motivated and talented guys is absolute peak get-shit-done.
A small operation in terms of number of developers, yes. But not being able to subsidise user growth means they will never be able to build their own full web index as the fixed cost of doing that is too high for a small number of users.
As a consequence, they will always be at the mercy of Google/Bing. The range of things they can innovate on will always be limited. And the situation can only get worse as people start to expect more AI functionality.
I doubt that you can build a sustainable niche product if the effort you have to put into it is just as big as if you're building for billions of users. Having few users is not what defines a niche. Niches are defined by specialisation.
They're focusing on 2 things: a search engine AND a Mac-only web browser.
What are some examples of those?
There's a lot of wasted effort in this industry. Slack has 1500+ employees just to make a chat app. Granted, it's a damn good chat app, but Mozilla is maintaining a browser with half as many people, with not everyone focused on Firefox at that.
My friend is now in a project where he billed for three weeks until all the issues with his dev account were sorted out. It remains unclear when he will be able to start actually working.
I spent two years building a web app + gRPC server that perhaps could have been just the latter.
I could go on. Point is, you can blow through $30mln easily but that doesn't mean you have to.
They aren't building a web crawler. Kagi is a search "client" (it is one way to build on a shoestring budget, alright), augmented by an in-house small-scale just-in-time crawler.
> From here, we take your query and use it to aggregate data from multiple other sources, including but not limited to Google, Bing, and Wikipedia, and other internal data sources in order to procure your search results. https://help.kagi.com/kagi/privacy/privacy-protection.html
> We also have our own non-commercial index (Teclis), news index (TinyGem), and an AI for instant answers. https://help.kagi.com/kagi/search-details/search-sources.htm...
I like Kagi's general vision for LLMs + Search. There's a real chance small competitors can compete with Google with clever use of LLMs' zero-shot summarization, categorization, intent recognition, and answering abilities.
Especially when Google keeps getting worse year after year. Just search for my damn keywords, like you did 10 years ago, especially when I have already put quotes around them because you're useless!
As someone who has already replaced Google with Kagi, as far as I'm concerned they've already done it.
I've never paid Google for anything (apart from with my data) but I've been paying Kagi for almost a year now. The model is different. From what I've seen so far, it's better.
The amount of money you burn through is not a good indicator of whether you'll end up with a good product or viable business at the end, IMHO
I had no idea how intrusive Google was. I love Google services. Google Cloud is far better than the others. That said, the search is just poison.
I now spend time looking at results, not searching for results.
I have to admit though, ChatGPT did make me consider dropping the subscription. Mr Chat has weakened the value of Kagi to me.
They have their own AI summarizer engine though: https://kagi.com/summarizer/index.html
Per their own documentation, Kagi servers are hosted on Google Cloud.
I felt the same, but I do still need web search, and I do still want it to be as good as possible, so it's still worth it to me.
And actually, as I type this I'm less sure about how much ChatGPT has reduced the value of web search for me. I start with ChatGPT for almost everything that's not "news". But I still frequently do web searches branching out from what I learned via ChatGPT. It's fewer total searches than I did before, but each one might be more valuable. And since I'm flowing from one tool to another, not having to wade through ads, bad UI, and SEO spam also becomes even more valuable.
We beg to differ on the need to raise large sums. Like Kagi we are on a marathon not a sprint. We have built a no-tracking and completely independent crawler search engine and infrastructure from the ground-up having raised £3m from angels only. Cuil, Blekko, Quaero and Neeva who raised 10s/100s of $m may have come and gone; meanwhile we have been slowly building since 2004, with a user and API customer base that is also growing healthily.
All they need to do is nail enough value add for those users. The wider goal of disrupting Google/Bing is a different game. But getting a small company to 10K paying users might be doable given enough of a value add.
Of course they are based in the Bay Area, so this kind of money doesn't give them a huge runway there.
* For core search, that is. Obviously there are vast slabs of services like maps, translate, etc. where they're not even trying to compete.
This changed with ads. Suddenly search engines were used to find people searching.
It's more about curation (and I guess some machine learning) than completely new tech.
So long story short, my opinion is the exact polar opposite of what you said.
> We are not affiliated with the legendary Kagi - the shareware payments platform. … We liked the name and acquired it when we got the chance.
What? Of course there is. Low net worth isn’t a protected class. We don’t do it that way because the purpose of the law is to protect small investors.
If a fine or sanction from the government was applied to the investor themselves based on net worth, I would argue 5th Amendment and 14th Amendment equal application of laws, and 1st Amendment freedom of speech, citing the Citizens United case, where moving money is speech.
I think that would be resilient in most circuits and certainly at SCOTUS.
Zero right now (we dropped Bing completely after their API price hike). We need to update the documentation to reflect this, thanks for pointing it out.
We as humans tend to trick our own minds into worrying about insignificant things and sometimes need somebody to help us snap out of it. I was trying to help you with that, but instead you think I came across as a jerk. It's not worth your time to worry about a couple of cents for a search. I know many people have the same concern and worry as you, it is still not rational.
> I used the trial and I wasn't sold on it
Then there is no reason for you to use it - even if it was free. The only reason to use paid search is to get better quality results. There is no special maneuver to use with Kagi, it's just typing the query into the box like on Google.
If ever there was a moment where horizontal scaling looked promising, this is it.
During my one-month test of Kagi using it for everything on all my devices I did 674 searches according to them.
There are real contractual restrictions on Google accessing any data that a customer generates, and breaching them carries hefty fees for Google.
This is one of the reasons why some Google products never launch with support for Workspace accounts. There's just too much red tape that a team doesn't want to deal with.
Source and Disclaimer: I was a Technical Solutions Engineer for Google Cloud.
As an example: I'm playing Diablo 4 at the moment, and getting info about the convoluted systems can be a chore. I didn't know that I have to do at least a Tier 3 Nightmare Dungeon to unlock Sigil Crafting. None of the usual pages mentioned that. The Forum Lens saved me a lot of time because users discussing it mentioned it quite a lot.
I'm going to go have a coffee.
That's 183 searches per day. Assuming 8 hours of sleep, that's about 11.4 searches per hour, or roughly a search every 5 minutes. That's a busy month :)
Such a waste. macOS users might be open to paying (for Kagi in general) because they're used to paying for a bunch of things that come included or as freeware on other OSes, but still, the market share is small (depending on the source, 10-30%). And even if many of those would enjoy a Mac-native app, there are at least some, like myself, who refuse to use single-platform tools. I have a bunch of devices on a bunch of different OSes, and I'm not going to use a very special browser on one of them, losing sync, history, and muscle memory when switching. That's the reason I can't stay on Arc even if I quite like some of its goals and structure.
You always need to start somewhere, Kagi is now (just) a 15 person team. There are enough Macs (200M+) and iPhones (1B+) in use to justify selecting this as the first ecosystem to target for the browser. Really it comes down to resource management and allocation in the early stages.
Everyone has their own definition of success. For many, a "successful" tech startup is one that eventually lands a $100M+ buyout, or gets millions of paying customers, or some other metric that's far beyond "simply" arriving at a sustainable business that you (and I) mentioned.
arxiv.org, nih.gov, researchgate.net, *.edu
SAFEs muddy the waters. It's just another thing that already overwhelmed retail investors need to take into account when considering whether to invest.
Agree, but are SAFEs any worse than a normal convertible?
Sure. But Google and Bing have shown a lack of interest in innovating for power searchers. I believe that will continue.
And even so, this space is big. I'm sure there are many niches that would fit a scrappy, subscription-based player.
What do you reckon the cost of doing this would be?
Just doing the napkin math for say a Mojeek sized index (couple of billion docs) doesn't seem to justify a particularly astronomical budget.
That is indeed the key question. I tried to find out before commenting but the information I found is very vague. Internet Archive spends millions per year, but their index is updated far too slowly for a search engine. I have no idea what it costs to create a Google-size index.
Do you think the Mojeek index is good enough to compete with Google?
As for the index: this underpins mojeek.com and our API, which customers use for search and/or AI. Common Crawl is ~3.5 billion pages and underpins LLMs. Our index is ~7 billion. Who knows what (else) Google and Bing do with their indexes ;) ?
[0] https://blog.mojeek.com/2023/05/generative-ai-threatens-dive...
As a back-end for something like Kagi, it's sure getting there. Most of what sets Google apart is their exceptional level of user profiling. The actual indexing technology is likely on par with most of their competition.
Of course very little of that enters into API queries.
Assuming they'll need to index 820 billion pages (the number of pages preserved in the Internet Archive), at 100 KB each, and assuming they use a database that compresses text to 0.3x its raw size, they'll need at least 24,600 TB to store that text. At $300 per 16 TB disk, that is about 1,538 disks, or roughly $461,000 for disks alone. This is a lot of money just for storage, and we haven't included things like replication and backup, indexing metadata overhead, etc.
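The estimate can be reproduced with a few lines of Python. This is only a sketch using the same inputs as the comment (820 billion pages, 100 KB per page, 0.3x compression, $300 per 16 TB disk); the constant names and the decimal unit convention (1 TB = 10^9 KB) are assumptions made for illustration. The disk count depends on dividing the total by the per-disk capacity, and skipping that division inflates the cost sixteenfold.

```python
import math

# Inputs taken from the estimate above; names are illustrative.
PAGES = 820_000_000_000      # pages preserved in the Internet Archive
KB_PER_PAGE = 100            # assumed average page size
COMPRESSION_RATIO = 0.3      # stored size as a fraction of raw size
DISK_TB = 16                 # capacity of one disk, in TB
DISK_PRICE_USD = 300         # price of one disk

KB_PER_TB = 1_000_000_000    # decimal units: 1 TB = 10^9 KB


def storage_estimate(pages=PAGES):
    """Return (raw TB, stored TB, disks needed, disk cost in USD)."""
    raw_tb = pages * KB_PER_PAGE / KB_PER_TB
    stored_tb = raw_tb * COMPRESSION_RATIO
    # Disks are bought whole, so round up.
    disks = math.ceil(stored_tb / DISK_TB)
    return raw_tb, stored_tb, disks, disks * DISK_PRICE_USD


if __name__ == "__main__":
    raw_tb, stored_tb, disks, cost = storage_estimate()
    print(f"raw: {raw_tb:,.0f} TB, stored: {stored_tb:,.0f} TB")
    print(f"disks: {disks:,}, cost: ${cost:,}")
```

This covers a single copy of the compressed text only. Replication, backups, and index metadata would multiply the figure.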
You're ignoring that Kagi runs on subscriptions, so this funding is not all there is. If the subscription covers basic user costs (which it should, because they just adjusted the pricing) or at least covers most of them then the 670K can be used for growth.
Isn't that how Firefox got popular? Techies started to use it and then convinced others to switch. Of course, in the end it was no match for Google's marketing of Chrome.
But maybe there's room for smaller search engines with alternative business models to compete with each other. marginalia.nu is another promising startup.
Nearly all of us in the "niche technical audience" are the personal IT support departments of some subset of our family, friends, and their friends. They come to us asking for advice, or to set their computers/phones up, or to fix them after they "caught viruses". However begrudgingly we do that, it puts us in a position of power - they listen to and trust in what we say, and accept uncritically what we do to their machines. And, our interests are mostly aligned with theirs - even if we don't care about particular friend-of-a-friend's happiness, we'll still do them good so they don't have to come back with more issues any time soon.
This is how Google Chrome spread. This is how Firefox still survives. This is how several brands of anti-malware software spread - a mistake that's now difficult to undo. This is how AdBlock Plus became a thing, and how uBlock Origin is now replacing it. All these trends and more, I participated in first-hand. People still remember and follow advice I gave them over a decade ago (which isn't always good - see the anti-malware stuff).
And so is the case with Kagi, to a degree. I've been a paid user for 1.5 years now, happy with the service. I recommend it in relevant discussions, I mention it to people who spot it, but since it's a paid product, it is a tough sell with the general population. Still, I try to spread the word, like I do with any other good and non-user-abusive tool.
That said, this news makes me somewhat reluctant to recommend Kagi. I'll keep using it because it provides me immediate value for a reasonable price, but taking investment is often a Faustian bargain. In this case it's not as clear as with regular VC backing, so I guess we'll see where this goes.
> Nearly all of us in the "niche technical audience" are the personal IT support departments of some subset of our family, friends, and their friends.
How has that influence worked out for you so far? By that logic, all our family and friend circles would be running Linux, using OSS, and be more mindful of their privacy. IME my attempts at convincing others to use the tools that I use have mostly been met with a lukewarm response, or even arguments in favor of the tools that they already use. Convincing someone to change their computing habits is not just a matter of being considered an expert in the field.
Google Chrome spread because it had the resources and influence of a billion-dollar corporation, and the marketing budget to reach millions of users. Firefox is barely surviving, and most of it is due to its corporate contracts, not because of its technical audience.
Asking normal people to use Linux and OSS is a bit too much, but I think that MacBook sales have been helped at least a bit by the technical crowd's endorsement.
Why? Even if they never get close to being as popular as Google, they can still have millions of paying users.
Even if they're telling the truth, if they receive a national security warrant, they'll start collecting that information secretly. Hence my interest in a warrant canary.
- Users can register with any email. We really do not verify or need this apart from having a way to have an account, which we need to authorize payments
- Some users suggested using crypto to obfuscate identity, but this is not as straightforward and does not work as well with subscriptions
https://kagifeedback.org/d/493-enable-anonymous-payments-ala...
- The most interesting idea we've seen is using "blind signatures", which we are currently considering
https://kagifeedback.org/d/653-completely-anonymous-searches...
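For readers unfamiliar with blind signatures, here is a toy sketch of the textbook RSA variant. The thread does not say which scheme Kagi is considering, so the scheme, the function names, and the tiny key are all assumptions for illustration. The idea: the client blinds a token, the server signs it without seeing it, and the client unblinds the result into a valid signature that the server cannot link back to the signing request.

```python
import secrets
from math import gcd

# Toy RSA key with tiny primes, for illustration only.
# Real deployments use 2048-bit or larger keys and proper message encoding.
p, q = 61, 53
n = p * q                              # public modulus
e = 17                                 # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)


def random_blinding_factor() -> int:
    """Pick r in [2, n) that is invertible modulo n."""
    while True:
        r = secrets.randbelow(n - 2) + 2
        if gcd(r, n) == 1:
            return r


def blind(message: int, r: int) -> int:
    """Client side: hide the message before sending it to the signer."""
    return (message * pow(r, e, n)) % n


def sign(blinded: int) -> int:
    """Server side: sign without learning the underlying message."""
    return pow(blinded, d, n)


def unblind(blind_sig: int, r: int) -> int:
    """Client side: strip the blinding factor from the signature."""
    return (blind_sig * pow(r, -1, n)) % n


def verify(message: int, signature: int) -> bool:
    """Anyone can check the signature with the public key."""
    return pow(signature, e, n) == message % n


if __name__ == "__main__":
    token = 1234                       # hypothetical search token
    r = random_blinding_factor()
    signature = unblind(sign(blind(token, r)), r)
    print(verify(token, signature))
```

In a search setting, the account would pay for signed tokens, and a later search would present only the unblinded token and signature, so the server could check validity without knowing which account bought it.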
I'm sure you know your users well after so many years in the search engine business, but having read your article I must say I find your approach risky. You seem to be betting on search engines and answer engines continuing to be complementary rather than substitutes.
But we are not the ones making this decision. Users will be making the decision in light of the newly available AI capabilities, and they will be making it with complete disregard for the health of the web, as is their nature :)
The "funny" thing is that big publishers are as happy right now as I haven't seen them in the past 25 years, because it is so completely obvious that chat AIs will destroy the web unless big tech starts making big payments to big publishers. As you rightly say, small businesses and publishers will be collateral damage.
But how do you make sure you're not collateral damage as well?
An LLM that digests the results of a classic search index is greater than the sum of its parts. An LLM that is not permitted to brush up on the relevant literature before answering a question generally doesn't produce very good answers, is prone to hallucinations, etc. A pure LLM design isn't even a serious contender in the answer engine space.
So you're looking at maybe 20 bn docs, 4 KB each. That's on the order of 100 TB, before compression.
"The Google Search index contains hundreds of billions of web pages and is well over 100,000,000 gigabytes in size."
https://www.google.com/intl/en_uk/search/howsearchworks/how-...
Doesn't mean you have to be as big as Google to do something useful of course.
Google's index is likely very large because they don't have any real economic incentive to keep it small.
They changed that limit somewhere in the mid-2000s. Just as well, there are some CMSes out there where there are several hundred kilobytes of inline JS and CSS before any body text.
Is that a good way to build an index?
An index is just a hashmap from words to lists of URLs. So you have to parse the page, and add the URLs and word frequencies to the lists.
In terms of storage it is cheaper; in terms of computing power it is more expensive.
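The hashmap description can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how a production engine stores its index; the function names and the example URLs are made up for the sketch.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: frequency} for a dict of url -> page text."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word, count in Counter(tokenize(text)).items():
            index[word][url] = count
    return index


def search(index, query):
    """Return URLs containing every query word, highest total frequency first."""
    words = tokenize(query)
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    common = set(postings[0]).intersection(*postings[1:])
    return sorted(common, key=lambda url: (-sum(p[url] for p in postings), url))


if __name__ == "__main__":
    pages = {
        "https://a.example": "Search engines index the web",
        "https://b.example": "The web is big, the web is wide",
    }
    index = build_index(pages)
    print(search(index, "the web"))
```

Ranking by raw word frequency is the simplest possible choice here; it is only meant to show where the stored frequencies get used.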
Hash tables almost guarantee worst-case disk read patterns. You use something like a skip list or a B-tree, since that makes much better use of the hardware, and on top of that allows you to do incredibly fast joins.
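To illustrate why sorted structures make joins fast, here is a small sketch that intersects two sorted posting lists. Plain in-memory Python lists with binary search stand in for an on-disk skip list or B-tree; that substitution, and the function name, are assumptions for the example. Because both lists are ordered, each lookup can resume from where the previous one ended, so large stretches of the longer list are never touched.

```python
from bisect import bisect_left


def intersect_sorted(a, b):
    """Intersect two ascending lists of document ids."""
    # Walk the shorter list and seek forward in the longer one.
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    result = []
    position = 0
    for doc_id in short:
        # Seek to the first entry >= doc_id, never moving backwards.
        position = bisect_left(long_, doc_id, position)
        if position == len(long_):
            break
        if long_[position] == doc_id:
            result.append(doc_id)
    return result


if __name__ == "__main__":
    cats = [2, 5, 9, 40]
    dogs = [1, 2, 3, 5, 8, 13, 21, 40, 41]
    print(intersect_sorted(cats, dogs))
```

A hash table offers no equivalent of "seek forward from here", which is why every probe tends to become a random read.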
The big publishers certainly won't let you do that as they are selling their data to Google, Microsoft, Facebook and whoever else has the money to train a fully fledged LLM, which is certainly not everyone.
A search engine partnering with an answer engine may not send traffic, but the answer engine is a potential customer for the websites the search engines direct them to.
Only indirectly by charging search engines for access to content. It would be an entirely different business model that requires a complex set of agreements between publishers, search engines and LLM providers.
Granted, it's not impossible and certainly worth considering if you have search engine expertise but no money to train an LLM.
Yes, absolutely, I didn't mean to imply otherwise. But first you have to figure out what you can discard beyond the HTML tags themselves to avoid indexing all the garbage that is on each and every page.
When I tried to do this I came to the conclusion that I needed to actually render the page to find out where on the page a particular piece of text was, what font size it had, if it was even visible, etc. And then there's JavaScript of course.
So what I'm saying is that storing a couple of kilobytes is probably not the most costly part of indexing a page.
Are there open source projects devoted to this functionality? It’s becoming more and more of a sticking point for working with LLMs: grabbing the text without navigation and other crap while maintaining formatting, links, etc.
For my specific purposes it has always been good enough to apply some simple heuristics. But that wouldn't have been possible without access to post-rendering information, which only a real browser (https://pptr.dev) can reliably produce.
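As a rough idea of what "simple heuristics" can look like, here is a sketch that drops text inside tags that usually hold boilerplate. It works on static HTML with Python's standard library, so it does not use the post-rendering information (position, font size, visibility) discussed above and will miss anything JavaScript inserts. The list of skipped tags and the names are assumptions, not the commenter's actual rules.

```python
from html.parser import HTMLParser

# Tags whose contents are usually boilerplate rather than body text (assumed list).
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside"}


class MainTextExtractor(HTMLParser):
    """Collect text that is not nested inside any of the SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible-looking text of an HTML document as one string."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    page = (
        "<html><head><style>p{}</style></head><body>"
        "<nav><a href='/'>Home</a></nav>"
        "<p>Hello <b>world</b></p>"
        "<script>var x = 1;</script>"
        "<footer>Contact us</footer>"
        "</body></html>"
    )
    print(extract_text(page))
```

Sites that build their layout from generic div elements defeat a tag-based filter like this one, which is where rendering the page first becomes necessary.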
The couple of kilobytes per document is the actual storage footprint. Sure, you need to massage the data, but that is almost entirely CPU-bound. You also need a lot of RAM for keeping the hot parts of the index.