When Apple strikes an exclusive deal with suppliers for parts, it is sound business practice.
When Google strikes an exclusive deal with Reddit, it is ..
Some of you have no idea how businesses work, and it shows.
It's because reddit is selling content created by users, based on promises that reddit supports open internet, open data, etc, without their consent and without sharing revenue, which may be legal but is likely not ethical.
The users hold the copyright (reddit claim that they made the meme) but reddit has the non-exclusive right to redistribute and license the content.
Two different things.
that's what I said: it is legal.
Remember, the only reason Reddit "won" was because Digg destroyed itself with a radical upgrade that everyone hated.
Reddit would have to do something similarly self-inflicted, and I can't even guess where people would go. Reddit was already an alternative to Digg -- what's the alternative to Reddit? I mean, it's certainly not Quora.
The main thing I see Reddit being useful for is discussions about entertainment.
There's probably a subreddit for your favorite sports team, Twitch streamer, TV show, book series, video game, politics (which is entertainment for some people).
Reddit has seriously degraded the experience of a lot of these communities with things like restricting custom CSS.
It seems to me that the way you'd disrupt Reddit as a startup is to pick a vertical and laser focus on becoming the best discussion board for that community. If it's sports then have integrations for live stats, scores, etc.
In general you could attract users by offering profit sharing on ads the same way Youtube does for creators.
Have the best moderation tools in the world, a constant pain point with Reddit. Give admins more flexibility over the appearance of the board, all things Reddit took away.
The other path for disruption would be if an established company with those communities tackled the problem. Lots of communities already use Discord, but they tend to also have a subreddit because chat and forums are different communication methods. Discord could easily offer a forum product as an extension of their chat services. If they do it well they'd drive a lot of users away from the subreddits.
>what's the alternative to Reddit? I mean, it's certainly not Quora.
If it was deliberate I certainly can't tell, but one of the characteristics of Reddit is that it caused so many other little tiny internet forums to just wither away. Most were visually unappealing, running some ancient phpbb software or whatever, but there were so many like stars in the night sky. Now, if they're even still running, you look for the newest post, and it will say "November 2023". Hell, the only reason they are still running is that the credit card number on file paying for hosting doesn't expire until next year somehow. Reddit is a red tide algae choking out all life in the ocean, nothing else gets to exist anymore.
This site is essentially 'orange reddit', they just need to add sub-HNs or tagging or something and it'd be ready for an influx of reddit refugees. Not that any of us really want it, but it's possible.
You don't want to end up banned from a movies forum because you also participate in a political forum. Federation solves that problem because you can use separate accounts without either forum knowing that you also use the other.
Honestly, it's strange to me how hard people are trying to make distributed anything happen. Federation mostly solves a problem that real people don't have or care about.
I would like to see a wikipedia-style system for Twitter/Reddit: open access data, non-profit.
They could monetize it much better while being less annoying.
Ultimately - Google is getting everything they want from Reddit with this deal without having to buy it outright.
Short of Reddit transforming to an entirely different product (difficult) - I'm not sure where the major growth opportunity is for it.
Does this mean there will be a future where everyone is running their own crawler? I suppose.
As soon as someone shows me a search engine that restores quality of search, I'm getting a subscription for work.
It really can't be hard to whitelist sources and index appropriately.
Get going, nerds, Google has fallen.
I remember seeing an unhelpful hyperlink for the first time. It was a random word in the body of a random tech site that redirected to a list of articles from that site tagged with that term.
I remember being stunned, my expectation was that the link would lead me to another website, one that would be an authoritative source on that term and freely accessible.
20 years later we get a paywalled article about fragmented web – and we’re not slowing down.
https://cc.bingj.com/cache.aspx?d=5070227914243&w=ljIRk8yx42...
For example, when I search for product reviews, I always specify reddit. Otherwise the search results are inundated with SEO spam.
Reddit's justification for this is profoundly wrong. Their "public content policy" is absurd doublespeak, and counter to everything the open internet is and hopes to be. You cannot simultaneously call yourself "open" and "public" while refusing access to automated clients. Every client is automated. They even go so far as to say that "crawling" (also known as "downloading") is an "abuse" and violates user privacy.
This is absurd, and not justified. I would love to see legislation that restricted server operators' ability to prohibit automated access in this way, but I suppose it will never happen. Some people in this thread have attempted to justify the policy by saying "they have to protect their income streams". No they don't. You don't have a right to an income stream, and you certainly don't have a right to lie in order to get all the benefits of an open internet with none of the downsides. Noting of course that the "downsides" are in this case actually just "competitors".
Or are you somehow suggesting that it’s google’s fault that Reddit took this step? I don’t see any indication that’s the case.
google is using their power to prevent others from competing.
the problem here is of course that if reddit would be in financial trouble (i don't know if they are but let's imagine they need this money), they'd be between a rock and a hard place.
google should not be allowed to make exclusive deals, and reddit could not survive without the deal, then what would be left? google buys reddit, or the relevant authority approves of the deal?
i thought about the same problem with firefox. let's assume firefox is forced to allow people to make a choice of the default search engine (just like microsoft was forced to allow a choice of default browser on windows) then google might stop paying mozilla, and they could end up in financial trouble.
ideally no company ever depends on a single other company that much, but that only works if we don't allow companies to grow that much in the first place.
https://help.kagi.com/kagi/search-details/search-sources.htm...
>Our search results also include anonymized API calls to all major search result providers worldwide
Incorrect: https://www.mojeek.com/about/why-mojeek
> Searching for Reddit still works on Kagi, an independent, paid search engine that buys part of its search index from Google.
Also things like the API fiasco, and also small annoyances like the fact that when you click on an image on reddit, it now goes to a wrapper html page instead of just the actual image (this was one of the reasons reddit was better than most social media...).
Part of the blame for the redirect-to-wrapper page lies with browsers. If browsers didn't let servers reliably differentiate between a direct request and an <img> embed then this practice would not be as widespread.
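As a rough illustration of the mechanism described above: browsers that implement Fetch Metadata send a `Sec-Fetch-Dest` request header, which is `image` for an <img> embed and `document` for a top-level navigation. The sketch below is a hypothetical server-side decision function, not Reddit's actual code; the function name, the return labels and the `Accept` fallback are assumptions made for illustration.

```python
def choose_response(headers):
    """Decide what to serve for an image URL, given lower-cased request headers.

    Hypothetical sketch: returns "raw-image" or "wrapper-page".
    """
    dest = headers.get("sec-fetch-dest", "").lower()
    if dest == "image":
        # The request comes from an <img> embed: serve the image bytes.
        return "raw-image"
    if dest == "document":
        # The user navigated straight to the URL: serve the HTML wrapper.
        return "wrapper-page"
    # Assumed fallback for clients that do not send Fetch Metadata:
    # guess from the Accept header, which navigations start with text/html.
    accept = headers.get("accept", "").lower()
    if accept.startswith("text/html"):
        return "wrapper-page"
    return "raw-image"
```

Without a signal like this, a server could not tell the two kinds of request apart and would have to serve the same response to both.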
Blame Reddit, not Google.
The most recent 10-Q financial results for the quarter ended 2024-03-31 (filed 2024-05-08) show they actually lost money: https://www.sec.gov/edgar/browse/?CIK=1713445
(For 2024-Q1, Reddit lost $575 million on revenue of $242 M.)
If the quoted "$60 million deal"[1] from Feb 2024 is accurate, that small amount from Google may not be enough for Reddit to turn a profit. It remains to be seen what the Q2 or Q3 financials will show.
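To put the numbers quoted above side by side, here is a back-of-the-envelope sketch. It takes the comment's figures at face value and assumes, generously, that the entire reported deal value is counted against a single quarter's loss; the actual payment schedule is not given in this thread.

```python
# Figures as quoted in the comment above, in millions of USD.
Q1_2024_NET_LOSS = 575
REPORTED_DEAL_VALUE = 60  # assumption: counted in full against one quarter


def loss_after_deal(net_loss, deal_value):
    """Remaining loss if the whole deal value were set against the loss."""
    return net_loss - deal_value


remaining = loss_after_deal(Q1_2024_NET_LOSS, REPORTED_DEAL_VALUE)
print(f"Remaining loss: ${remaining}M")  # prints "Remaining loss: $515M"
```

Even under that generous assumption the quarter stays deep in the red.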
Then again most of what that site does is just blend and regurgitate the information that's currently on it anyway.
For a while, the internet had an end-run play that made that guy less useful. You can just go on the internet for obscure movie information, buddy.
But now it seems like knowing a movie guy is going to be the only way to get a real person's opinion on movies. The internet is about to forget everything without a profit motive and just start telling you that the latest product from a monolith corp like disney is the only movie worth watching. If someone scrapes all the useful movie opinions off of reddit and spends their time crafting it into a usable format, that guy's probably got a company. But not Bill. Bill's just a guy you can know or not know. You can't monetize knowing Bill. Sidenote that's probably why it irked me so bad when some bozo coined the phrase "social capital".
Startpage, Kagi and Lukol are 3 that source from Google. I imagine there are others.
It'd be somewhat hilarious if google bought reddit just to archive it and shut it down.
It's like what happened to personal websites when things like Blogger, Tumblr and Facebook popped up.
It's hard to beat something that is easy to set up and pays for hosting but still lets you control moderation. It's a no-brainer.
Managing your own domain where users post content is a minefield of problems these days even if you didn't mind the cost of running it.
Something like this might also explain the move to things like Discord over IRC.
IMO, something federation is very good at is solving one slow-moving problem - enshittification of social platforms. It's not immune, of course, but an Elon Musk-style takeover is much harder with Mastodon than Twitter, and it would be hard to run it into the ground in other ways because the platforms are owned by different people and groups.
I just read they have 2000 employees which is also puzzling to me
Compare and contrast to say, HN, run on two servers in a colo with less than a handful of mods.
They need to make sure the stuff they want people to think is posted often and has a big number next to it, they need to make sure things that people like are associated with the stuff they want people to like/think/do and things that people don't like are never associated with the stuff they want people to like/think/do. They need to make sure that people who say the wrong things are silenced or persuaded to leave, etc etc. Man they probably have at least one contact in at least one intelligence agency and they have to make sure not to run afoul of that contact.
Like the news isn't just a list of what happened recently, political debates aren't just two guys talking, and reddit/twitter aren't just message boards.
let's assume apple is forced to allow people to make a choice of the default search engine in safari then google might stop paying apple, and ...
Firefox has always allowed people to make a choice of the default search engine, since before it was even called Firefox. I know. I was there building it.
if the same was done for search engine choice for firefox then google would no longer be the default, and they would have no reason to pay firefox for that.
And as for Firefox, Mozilla being forced off Google's teat would be the best thing that could happen to the browser.
I assumed he had financial connection to them, but didn't want to take the time to research it. Mojeek is the new fetch.
Is Google's contract with Reddit exclusive, so that other search engines aren't given the opportunity to also pay?
I highly doubt that, especially since the DOJ would go after that immediately because of antitrust.
So no, pretty sure the blame here is 100% on Reddit unless you have evidence otherwise.
> so that other search engines aren't given the opportunity to also pay?
this makes it harder for new engines if google has exclusive deals with some of the most popular sites
My comment said, show me that the Google deal with Reddit is exclusive.
You haven't done that.
And there's no reason to think it would be, because of antitrust. The DOJ doesn't have to act "immediately", the point is that obvious antitrust violations come with fines that make it unprofitable to attempt in the first place. And this would be a black-and-white obvious antitrust violation, given Google's monopoly status in search. This isn't a gray area where it might be worth it for Google to roll the dice.
whether or not the deal is exclusive OR companies have to pay to index reddit it's still bad for competition. money is a barrier to entry that keeps newcomers out.
I can blame reddit for creating the deal and I can blame google for accepting the deal if the effect is bing, ddg and others cannot display reddit results without reaching some deal.
I'm saying the blame is 100% with Reddit.
Blaming Google for accepting it makes no sense. That's like if a shopper goes to a grocery store and buys an expensive $20 piece of cheese, and other shoppers can't afford cheese that pricey, and you're blaming that one shopper for buying it because it means other shoppers can't also get the cheese without paying for it. That doesn't make any sense. The store set the price, and they're the one to blame if other shoppers can't afford it.
If Bing, DDG and others can't reach a deal with Reddit, that has nothing to do with Google.
Again, blame here is 100% on Reddit, and 0% on Google. To assign blame to a purchaser in a case like this doesn't make any sense.
i don't think google is blameless like you propose.
Reddit can charge smaller companies less money. So if there's a problem, again, the problem is 100% with Reddit.
Google is absolutely blameless here. You may not like Google, and you can certainly blame them for plenty of other things. But in this situation, literally all of the blame is with Reddit for deciding to remove their content from all search engines unless they pay. Reddit didn't have to do that. Google didn't make them do that.
Reddit did this. Not Google.
# Welcome to Reddit's robots.txt
# Reddit believes in an open internet, but not the misuse of public content.
# See https://support.reddithelp.com/hc/en-us/articles/26410290525844-Public-Content-Policy Reddit's Public Content Policy for access and use restrictions to Reddit content.
# See https://www.reddit.com/r/reddit4researchers/ for details on how Reddit continues to support research and non-commercial use.
# policy: https://support.reddithelp.com/hc/en-us/articles/26410290525844-Public-Content-Policy
User-agent: *
Disallow: /
Source: https://www.reddit.com/robots.txt
You can see it here: https://search.google.com/test/rich-results/result?id=_mYogl... (click on "View Tested Page")
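For anyone who wants to check what those two rule lines mean in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It only parses the `User-agent: *` / `Disallow: /` rules quoted above; the helper name `is_allowed` and the example bot names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two effective rule lines from the robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for bot in ("Googlebot", "ExampleBot"):
    print(bot, is_allowed(bot, "https://www.reddit.com/r/python/"))
```

Under these rules every crawler that honours robots.txt is told to stay away from every path, so any search engine still showing fresh results must either ignore the file or be served something different.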
Calling it "public" content in the very act of exercising their ownership over it. The balls on whoever wrote that.
so they can do whatever they want with it and the actual owners/authors have no chance to really influence Reddit at all to make it crawlable. (the GDPR-like data takeout is nice, but ... completely useless in these cases where the value is in the composition and aggregation with other users' content.)
https://old.reddit.com/r/redditdev/comments/1doc3pt/updating...
Of course, that became unsustainable so now I have everything behind a login wall.
I will never take a statement given by a company that blatantly lies like this at face value going forward. What a bunch of clowns.
This is a dangerous precedent for the internet. Business conglomerates have been controlling most of the web, but refusing basic interoperability is even worse.
If reddit had an exclusive agreement, it would be anti-competitive.
This is classic HN anti-Google tirade (and downvoting facts, logic and concepts of free market)
Yes, actually, there is - having $60m to throw around.
"Barriers to entry often cause or aid the existence of monopolies and oligopolies" [0]. Monopolies and oligopolies are definitionally the opposite of free market forces. This is quite literally Econ 101.
How many other sites might have leverage to charge to be indexed?
I don't want to live in a world where you have to use X search engine to get answers from Y site - but this seems like the beginning of that world.
From an efficiency perspective - it's obviously better for websites to just lease their data to search engines than both sides paying tons of bandwidth and compute to get that data onto search engines.
Realistically, there are only 2 search engines now.
This seems very bad for Kagi - but possibly could lead to the old, cool, hobbyist & un-monetized web being reinvented?
https://www.ilga.gov/legislation/ilcs/ilcs4.asp?DocName=0720...
But I can understand why they made the change they did. The data was being abused.
My guess is that this was an oversight -- that they will do an audit and reopen it for search engines after those engines agree not to use the data for training, because let's face it, reddit is a for profit business and they have to protect their income streams.
Being forced into using google services, because they are paying information companies to deal only with them seems like a disaster for the web.
I'm not sure what to make of that.
It all depends of course what the market is. If one looks at reddit not as a whole but as a collection of niches then one could imho find niches where reddit has a dominant knowledge position.
An enterprise sales team with only 1 customer happens (eg, Mozilla 's search bar), but... That's surprising here, and scary as a sustainable & scalable business. Ignoring 5-6 figure/yr inquiries says a lot to me. In contrast, we did that same-day with Twitter without talking to anyone.
I wonder if this might affect reddit, as in slowly kill its user base, especially when it comes to users providing (and often also looking for) high quality content, because who among such users would want to use google search?
I don't understand what you're saying. That's exactly why people append `site:reddit.com` to their searches in the first place, because those search results typically aren't like that.
The veracity of this statement is questionable.
I found at least four web search engines not using Google's index that produced results from the last week.
Example: Recent eruption at Yellowstone Black Diamond Pool
https://www.ecosia.org/search?method=index&q=site:reddit.com...
https://search.brave.com/search?q=reddit.com+black+diamond+p...
https://api.yep.com/fs/2/search?client=web&gl=all&no_correct...
POST /sp/search HTTP/1.0
host: www.startpage.com
content-length: 74
content-type: application/x-www-form-urlencoded
query=site:reddit.com black diamond pool&abp=-1&t=&lui=english&sc=&cat=web
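For reference, the raw request above can be reproduced roughly as follows with Python's standard library. This is a sketch that only builds the request object and does not send it; the host, path and form fields are copied from the comment, while the `https` scheme and the helper name are assumptions, and whether Startpage still accepts this form is not something the thread confirms.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host and path from the raw request above; the https scheme is an assumption.
STARTPAGE_SEARCH_URL = "https://www.startpage.com/sp/search"


def build_startpage_request(query):
    """Build (but do not send) a form POST like the one quoted above."""
    form = {
        "query": query,
        "abp": "-1",
        "t": "",
        "lui": "english",
        "sc": "",
        "cat": "web",
    }
    body = urlencode(form).encode("ascii")
    return Request(
        STARTPAGE_SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_startpage_request("site:reddit.com black diamond pool")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Note that `urlencode` percent-encodes the colon and turns spaces into `+`, so the body length differs from the `content-length: 74` of the unencoded body shown above.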
At least for this example, I got the same desired result using Reddit site search.
https://old.reddit.com/search/?q=black+diamond+pool
If anyone has some good examples of search queries that I can test showing why a search engine must be used, please share.
Google hasn't been a search engine in a long while, it's just an advertisement engine now.
it started when youtube removed the ability to search for videos older than 5 years, if I had to guess? cost saving, have every old video in cheaper storage... but it sort of fragments youtube, every couple of years you only get newer content.
It’s not like everyone wasn’t already pulling the same grift, but quantity really does have a quality all its own.
Capitalism seems to work ok for the common good until you remove all the protections. LLMs provide a de facto monopoly for the owner which must already be a near monopoly: they take vast resources to train; only a giant corp can afford to buy all the content and provision enough resources to train one.
LLM did not enshittify what's left of the internet, greed did it.
This is a very good point IMO. If we're going to chastise LLMs we may as well give servers, switches, routers, fiber-optic cable, and silicon a bollocking as well since that's ultimately what's facilitating all this.
And to me, forgetting to log in to each of them feels similar, too. For what that's worth. (I hate both of them when not logged in.)
AI companies like Google, Microsoft and OpenAI have deep pockets to 'unprotect' themselves from anything. The barrier to entry is for small AI companies and those aren't really making an impact currently.
Google really should blacklist reddit entirely for this practice, but sadly as bad as reddit is it's still a much higher quality result than average for google.
[0]https://www.reddit.com/r/ChatGPT/comments/133xgb5/gpt2_was_p...
edit:
> Realistically, there are only 2 search engines now.
https://seirdy.one/posts/2021/03/10/search-engines-with-own-...
From the article:
Many alternatives to GBY [Google, Bing, and Yandex] exist, but almost none of them have their own results;
This seems to assert that ~0 other search providers do any crawling at all. Ever. Are we sure that's the case? (They could crawl but never ever return those results, which would be even more odd.)
Scraper engine -> validation/processing/cleanup -> object storage -> index + torrent serving is a rough pipeline sketch.
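A toy version of that pipeline, to make the stages concrete. Everything here is a hypothetical in-memory stand-in: the scraper is replaced by a list of already-fetched records, the object storage is a dict keyed by content hash, the index is a simple token-to-key map, and torrent serving is left out entirely.

```python
import hashlib
import json
from collections import defaultdict


def validate_and_clean(raw):
    """Validation/cleanup stage: drop unusable records, normalise whitespace."""
    text = " ".join(str(raw.get("text", "")).split())
    if not text or "url" not in raw:
        return None
    return {"url": raw["url"], "text": text}


class ObjectStore:
    """In-memory stand-in for object storage, keyed by content hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, record):
        blob = json.dumps(record, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(blob).hexdigest()
        self._blobs[key] = blob
        return key

    def get(self, key):
        return json.loads(self._blobs[key])


def run_pipeline(raw_records, store):
    """Clean each scraped record, store it, and index its tokens."""
    index = defaultdict(set)
    for raw in raw_records:
        record = validate_and_clean(raw)
        if record is None:
            continue
        key = store.put(record)
        for token in record["text"].lower().split():
            index[token].add(key)
    return index


store = ObjectStore()
index = run_pipeline(
    [
        {"url": "https://example.com/a", "text": "Black  Diamond Pool"},
        {"url": "https://example.com/b", "text": "   "},  # dropped as empty
    ],
    store,
)
print(sorted(index))
```

Keying the store by content hash means identical records are stored once, which is also the property a torrent-style distribution layer would rely on.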
[1] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu... ("HN Search: annas archive")
[2] https://academictorrents.com/details/9c263fc85366c1ef8f5bb9d... ("AcademicTorrents: Reddit comments/submissions 2005-06 to 2023-12 [2.52TB]")
It's not the beginning, it's mere continuation.
Walled gardens have existed since the AOL days. They deteriorate over time but it doesn't prevent companies from trying (each time, in bigger attempts).
It still exists. It just isn't that popular.
There's an established player with institutional protections, then a scrappy upstart takes a bunch of VC money, converts it into runway, gives away the product for free, gradually replaces and becomes the standard, then puts out an s-1 document saying "we don't make money and we never have, want to invest?" and then they start to enjoy all the institutional protections. Or they don't. Either way you pay yourself handsomely from the runway money so who cares.
The upstart gets indexed and has an API, the established player doesn't.
The upstart is more easily found and modular but the institutional player can refuse to be indexed to own their data and they can block their API to prevent ai slop from getting in and dominating their content.
Depends how you see it - if you see it as 'their' data (legally true) or if you see it as user content (how their users would likely see it).
If you see it as 'user content', they are actually selling the data to be abused by one company, rather than stopping it being abused at all.
From a commercial 'let's sell user data and make a profit' perspective I get it, although it does seem short-sighted to decide to effectively de-list yourself from alternative search engines (guess they just got enough cash to make it worth their while).
Is that actually true? Reddit may indeed have a license to use that data (derived from their ToS), but I very much doubt they actually own the copyright to it. If I write a comment on Reddit, then copy-paste it somewhere else, can Reddit sue me for copyright infringement?
Shared because it is unlikely Reddit responds except when required by law, so I recommend engaging regulators (FTC, and DOJ at the bare minimum) and legislators (primarily those focused on Section 230 reforms) whenever possible with regards to this entity. They're the only folks worth escalating to, as Reddit's incentives are to gate content, keep ad buyers happy, and keep the user base in check while they struggle to break even, sharing as little information publicly as possible along the way [1] [2].
[1] https://www.bloomberg.com/news/articles/2024-05-09/reddit-la... | https://archive.today/wQuKM
We thought it was an oversight too at first. It usually is. Large publishers have blocked us when they have not considered the details, but then reinstated us when we got in touch and explained.
you can always buy a competitor's or make your own vacuum cleaner if you hate buying at Walmart
maybe what you are really mad about is Reddit monopolising content
This has greatly enhanced my Google experience - easy to ban content farms, AI-generator websites from appearing in Google Images etc.
Kagi shill here. Are they finally applying filters and operators to image searches?
Asking because it was a tough year seeing Pinterest as top filter choice and top result in images (when set as filter=block).
(edit: I just tried searching->image: beautiful quilt patterns. I didn't spot any Pinterest results!)
I have never understood why DDG, etc steadfastly refuse to obey operators in image searches. Most days. Every blue moon operators seem to work. I think.
sidebar: Yesterday I saw Yandex obey quotes in a web search. It was the 1st time I've seen that.
Honestly, that makes those other engines way less valuable because for many topics, telling the engine to specifically narrow the results down to reddit comments is the only way to get a decent answer to what you're looking for. I'd definitely support blocking Quora from everything though.
I don't consider the discussions there to be "real" in any meaningful way, thanks to the extensive moderation.
From what I've seen, there typically ends up being a small handful of moderator-enforced narratives that are deemed "acceptable" for a given subreddit, and any commenters deviating from those narratives get banned, or their comments end up as "[removed]" by "[deleted]", or the comments get obscured with the "comment score below threshold" notice.
It's generally some of the most one-sided and blandest discussion around. Given that there's often no meaningful back-and-forth involving differing perspectives of any sort, I'm not even sure if it should be considered "discussion". It's more like regurgitation and repetition.
I've found the situation to be particularly bad on the Canadian locale-specific subreddits, for example, but enough of the tech-oriented ones I've seen seem to end up like that, too.
Obviously, people do see value in it, or they wouldn't keep saying so! I would happily exclude Reddit links from search results though.
Not exactly always a reliable source of info outside uncontroversial niche topics or places like /r/AskHistorians that actually moderate. And even there I've seen the occasional humdinger.
Things are more than the sum of their parts. If you have a ton of beneficial things which can be cobbled together into one bad detrimental thing, the existence of the latter does not remove the benefits of the former.
But yeah, the most outrageous one is older videos, I do believe the reason is that they are using some long term cloud storage that is cheaper for older videos so they removed the ability to search by date.
Additionally, I don't believe the API fully fixes it, because Bing has a wrapper for youtube and searches do not really vary
No such thing, and definitely not the AP.
Similarly, if you start a vacuum cleaner company, you can make whatever exclusive deals you want. But if you control 80% of the market for vacuum cleaners, then you might need to be more careful about leveraging your market share in unfair ways.
If a company is part of a robust, competitive market (like Reddit), it's usually wiser to let customers vote with their wallets, and leave the government out of it. If a company becomes massively dominant (like Google or TicketMaster), and if it starts pushing exclusive contracts, it's much harder for customers to switch away.
You don’t even need >90% market share for that to be the case. e.g. Standard Oil only controlled 64% of the US market at most, it was still broken up.
Many web scraping companies have loads of phones hooked up in a rack in order to use mobile IPs. Companies can't just block mobile IPs because their site would become unusable for several city blocks (mobile IPs often correspond to a specific cell tower). This is the face of modern web scraping: https://i.imgur.com/U2RXi5G.jpeg
Eventually it becomes expensive to scrape reddit's data and most people will stop.
Even HN can get a licensing deal if they want to.
If you are producing content, you have every right to do what you want to with the content.
For practical purposes, reddit can do whatever they want with users' posts. It's right there in the TOS.
One can also say every human is a leech benefiting off free software (creators) while complaining about their worthless chitchat, barely usable because of its basic semantics, being "stolen".
Of all the forums I used to be active in, many are still active. The ones that died did so because the community died (i.e. they did not shift to Reddit and the like).
Reddit is great simply because it allowed anyone to create a community. No need to get a LAMP stack and deal with security vulnerabilities in your forum SW.
These days you have Lemmy and its ilk. Much higher barrier than the old LAMP stack, but also much superior to it. I do hope it takes off.
They demand that right. That doesn't mean they actually have a right to use the content in ways that are not directly required for the operation of the website or that are otherwise surprising to the average user. Putting something in the TOS doesn't always make it a valid contract.
That was a bug, apologies. It should be fixed now.
Overall, my experience is very positive. I'm on many PCs throughout the day and I miss Kagi when it isn't there.
Note that scraping, regardless of the level of permission, doesn't mean you can do anything you want with the content. Copyright still applies. But you can scrape it, and if your use falls under Fair Use or another caveat to the copyright laws then you can go ahead and do it without needing any permission from the authors.
I haven’t read volume 1, but apparently half of it is about data scraping, and I expect it to be similarly detailed. So if I were you, that’s where I’d start.
Another option is looking for “robots.txt” at Google Scholar and trying various keywords like “legality”, “scraping”, “case law”, etc.
[1] https://free.law/recap/faq
(full disclosure: assisting someone pursuing regulatory action against reddit in the EU for a separate issue from scraping, it's a valuable resource, but the folks who own and control it are meh)
When I am answering some random dude on reddit with a problem I want that dude to read my solution. I don't want this to be crawled and forever stored (probably deanonymized) or enshrined in a dozen commercial LLMs. There is substack for that stuff.
In the end, I view potential AGI's as a common consciousness brainchild.
Is there a way to export my history? How?
(and there's some help article for it that I didn't read, but google found this first https://support.reddithelp.com/hc/en-us/articles/36004304835... that's how I got to the link)
Anyway, you basically submit a request and then later they will email you a link to a zip file that contains a few dozen CSV files with unescaped newlines. One for all the comments you made, one for up/down-votes, one for blocked users, etc.
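If you want to read one of those files, line-by-line tools will trip over the newlines, but Python's `csv` module copes as long as the multi-line fields are quoted. That quoting is an assumption here, as are the column names `id` and `body` in the sample; the real export may differ.

```python
import csv
import io


def read_export_rows(handle):
    """Parse an export CSV whose quoted fields may span several lines."""
    return list(csv.DictReader(handle))


# Hypothetical sample: the second field of the first row contains a newline.
sample = 'id,body\nabc,"first line\nsecond line"\nxyz,single\n'
rows = read_export_rows(io.StringIO(sample))
for row in rows:
    print(row["id"], repr(row["body"]))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` so the `csv` module sees the embedded newlines unchanged.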
It doesn't solve the problem, but if money is the only thing preventing search engines from accessing Reddit, then what goes for Google also goes for Microsoft.
That's a symptom of the issue, not a solution. Bing is used because having your own crawler is infeasible, partially because you will be literally blocked in many cases.
Monopolies are entirely consistent with free market economics. After all, if there's clearly a "best product" for a particular niche, it's entirely rational (free market actor) behavior for everyone to use the same product, leading to its monopoly in that market segment.
I don't understand why people think this isn't/won't be/shouldn't be a common result of "free market forces".
Not in the least. Literally in the first semester, Economics 101 type class that any business/economics/etc student would take, it would be covered clearly that monopolies are violations of free markets.
A free market isn't a euphemism for anarchy or "no rules", it's a specific economic term. The things it is free of include artificial price floors or ceilings, barriers to entry, anti-competitive practices, etc. In other words, monopolies, oligopolies, cartels, monopsonies, etc are all violations of a free market. You do not have a free market if there is a monopoly supplier.
If one competitor is far enough ahead of the rest, they can maintain that lead given that they can extract sufficient momentum from their early mover advantage. If they keep this up long enough, competitors never reach the scale to sufficiently prevent them from becoming a monopoly (at least over their local market segment).
None of this requires anti-competitive behavior; simply good execution on the part of the leader.
Unless, of course, you're suggesting that "free markets" also involve government intervention to suppress their lead in the market...
This is a fair critique. I'm approaching this from an admittedly American perspective in which "free market" colloquially implies competition - but I recognize that competition is not inherently a free market concept.
Good callout!
Having Barriers to Entry != Anti-competitive
Yes, large players have advantages of Economies of Scale.
Just because you can't run an Airline because you don't have money to buy an Airplane isn't anti-competitive.
Today Microsoft, Apple, OpenAI, Google, Amazon all can afford those piddly $60m to license from reddit.
Not Anti-competitive at all.
But saddened by how much corporate-hate by HNers destroys their credibility in debating these things.
Go ahead, downvote.
"Because barriers to entry protect incumbent firms and restrict competition in a market, they can contribute to distortionary prices and are therefore most important when discussing antitrust policy."
Antitrust policy then links to a page on competition law: "Competition law is the field of law that promotes or seeks to maintain market competition by regulating anti-competitive conduct by companies." [0]
So yes, I'd downvote you if I could, but HN doesn't allow downvotes - which is honestly pretty fitting in the context of this conversation.
> If you need advanced blocking features, such as specifying rules by match patterns (e.g., *://*.example.com/*) or by regular expressions (e.g., /example\.(net|org)/), the uBlacklist browser extension includes support for Kagi Search.
https://help.kagi.com/kagi/features/website-info-personalize...
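To make the regex variant concrete, here is a rough Python sketch of what a rule like `/example\.(net|org)/` does to a list of result URLs. This is only an illustration of the idea, not how uBlacklist or Kagi implement it; `is_blocked` and the sample URLs are hypothetical.

```python
import re

# The regular expression from the quoted help text, without the slashes.
RULE = re.compile(r"example\.(net|org)")


def is_blocked(url: str) -> bool:
    """Return True if the rule matches anywhere in the URL."""
    return RULE.search(url) is not None


# Hypothetical search results.
results = [
    "https://example.net/page",
    "https://www.example.org/a?b=1",
    "https://example.com/",
]

visible = [url for url in results if not is_blocked(url)]
print(visible)
```

Only the `.com` URL survives, since the rule matches `.net` and `.org` but not `.com`.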
I’m happy to continue this debate if you’d like to start supporting your posts with citations but probably won’t engage further unless you do. Have a great day!
That's helpful clarification.
In criticism of the article, you might agree that
> none of them have their own results
is a fairly absolute statement. It signals: Final word on the matter; no nuance to follow.
I'm not taking their reporting without compensation, but that also means I didn't have the whole story. Such is life in this era of the internet.
https://www.crawlson.com/ https://search.marginalia.nu/ https://wiby.me/ https://searchmysite.net/
And Yandex isn't much better for non-Cyrillic search, Baidu is only for the Chinese web effectively.
And all other search engines either don't even attempt to do full web crawls anymore and/or buy from one of the four above.
So realistically there's just one search engine for the full web that actually does the work.
I hope that one day they get a western version
I like Yandex when I'm rabbit-holing after obscure musicians/music. I routinely have a better experience than I do with DDG or Kagi or Goog.
Endless iterations of the discussion about how copyright should/shouldn't be are meaningless without considering the larger social context.
The vast majority of creators (artists!?) are not compensated at all, and the users' (content consumers'? the public's?) attention is already 100% fully saturated.
So GenAI churning out more kitsch hardly changes that. (It got popular when it was novel, and ... that was it for now. And it became one more brush in the ever-growing workshop.)
And if/when a company creates a product out of it, then that product needs to be scrutinized and we should consider the ethical, social, and political problems. (Because ... politics is a blunt tool whereas ethics is as infinitely nuanced as we make it, whereas actual social considerations ought to be pragmatic and fair.)
And that's the problem. Users benefit from cheaper access to customized content. (In other words, arguably it's a net positive thing that they can - for example - ask some GenAI tool to make them a nice picture for their friend's birthday.) So what's the cost of this? Does it make some people jobless? Is that good or bad? (Is it good that plant breeding programs, fertilizers, tractors, and irrigation systems made a lot of agricultural workers jobless? Well, in some sense yes, as it allows a few people to feed many, freeing up time for others to become doctors and artists. In some sense bad, because our current socioeconomic system does not provide real social support - despite enormous redistribution of GDP. [Because of brutal inefficiencies in allocation of that surplus. And that's again a political problem.])
... and of course here the status quo bias leads to "technological progress chipping away at inefficiencies" in a capitalist reality translates to "even more things get commoditized", and that in turn in the current shitty socioeconomic system equals "lots of externalities are not priced in, and lots of people are forced to drastically change their lives to adjust to new prices" (ie. price of their labor and/or products going down, so they need to change jobs, yet they barely get any support for doing so) ... and of course ethically it's bad that most people virtually uncritically accept and enjoy the results of progress (new products and services) without giving a fuck about the costs.
very arguably yeah. we evolve to where we push a button and get a birthday present. the logical conclusion is a system that does not even require to push a button. this is the opposite of the original point (spend time and effort making something to show you care). so to me it is not benefit, it is harm.
maybe you picked a bad example. but examples of legit benefits of this tech, ones that don't turn into harms if you really analyze them and go to the root, are hard to find in consumer land.
So by spending the same 20 minutes fiddling with some kind of graphics if the outcome becomes 10x better then I see that as an advantage. (That said I haven't tried this.)
Basically this is "exactly" that kind of capability that we've already seen in "artist paints kid's drawing" [0] just commoditized. And this is where I (as a user) would be looking for the style transfer. Because I saw something famous/trendy/fancy/aesthetically-pleasing and now want to imitate it as a gag for this hypothetical birthday card. (Sure, copyright already doesn't care if I download someone's famous photo, and cut my friend's head out from a photo I made, and put the head part on the copyrighted photo. Yet it was a straightforward derivative work. Just it's not commercial, damages are none, or even negative, i.e. the artist benefits from that famous exposure! :D)
> legit benefits of this tech
I wanted to try to find an example that could apply to a lot of users. Because as awesome as it is to use GenAI to generate Lean proofs (and then add a feedback loop by actually running the generated code through Lean) to solve Math Olympiad problems [1], it's not really an everyday thing.
> we evolve to where we push a button and get a birthday present. the logical conclusion is a system that does not even require to push a button.
Well, maybe! In no time we'll board the Axiom and just chill. [2] But at the same time crafts and DIY and experiences (tourism, festivals, concerts) are ridiculously popular. (And it has its own problems. [3])
[0] https://www.youtube.com/watch?v=dB-Q0eNsUaQ KID'S ART Redrawn by a PROFESSIONAL ARTIST! - Ep.6
[1] https://news.ycombinator.com/item?id=41069829
[2] https://www.thelist.com/img/gallery/things-only-adults-notic...
[3] https://edition.cnn.com/2024/07/08/travel/barcelona-tourism-...
Except that isn't what happened. Within the context, 'almost none' referred to size of the group, not to the amount of crawling.
I was discussing how much crawling is being done - outside of the 'almost none' sized group.
The 'almost none' sized group has 'their own results', so they do crawl. Based on that, the rest do not. Ergo, they do not crawl.
A search engine that never crawls seems non-intuitive.
I've no reason to doubt this so I'll attribute it to which results get pushed higher in the rankings. Different entities hold power in the E & W hemispheres and that can influence search results to a degree.
Past that, there are some things I get at Yandex because Kagi seems incapable of searching for them. ex: Long strings like MD5 and larger hashes.
edit: 30 seconds after posting I searched a hash and Kagi sent back a ton of results. As recently as last week I got zero results (diff hash).
That's the 2nd time in a week Kagi disproved a complaint of mine. I hope they continue to do so.
The myopic pursuit of short-term gains is the only playbook that works. Long-term business strategy is a gamble, and today's businesses have all learned that they'd rather make hay when the sun is shining than be remembered as a good business.
Twitter tried a long-term playbook to reverse their unprofitable sinkhole of a website. That ended up with them being undervalued and sold to the highest bidder.
From what I recall reading at the time, Twitter was finally becoming profitable before the sale (last two years? It’s hard to find a source now since every story since is about some shit show or other post sale).
> That ended up with them being undervalued and sold to the highest bidder.
You make it seem they were in dire straits and had to be sold for scraps, but that’s far from the case. They sold for more than their valuation to the only bidder because they understood what a good deal it was for them. They forced the buyer to not back out, after all.
Those websites were definitely technically inferior, as the march of progress is unavoidable, but web hosting is cheaper than it's ever been. A VPS that utterly blows away what mine was capable of in 2007 for nearly a hundred a month can now be had for about $10 per month. Yet everyone wants these monolith platforms, but even that wouldn't be the worst thing ever, except that every one of these platforms has a backend to support that we in the Old Internet never did: a C-suite's worth of executives and millions of shareholders, who for some reason have decided that reddit can't exist unless reddit makes them reams and reams of money.
I'd be very, very interested to see how much of, even what's probably the most massive one of all, Facebook, is non-essential busywork that could easily be shut down tomorrow with no adverse effects to the platform. Firstly the entire executive class, just, they don't do shit to make Facebook the product. In fact I'd argue their decisions almost universally have made it worse as a product very consistently for its entire lifetime. Then, all the marketing people. There's just no goddamn reason to advertise Facebook (or reddit for that matter): the brand is so ubiquitous that if you actually found someone who'd never heard of it, I'd give you a large chunk of money. Add to that, if Facebook was doing a good job of being what it ostensibly is, then people immediately become the best advertising, because people want to hang with people in these digital spaces. Then get rid of the people working to make Facebook addictive with dark patterns. Then get rid of the entire targeted ad division, because it's gross and inhumane. Pare the company down to engineers who build the product, and if anything, expand the moderation team so they can actually ensure the safety of the platform, and of course the IT staff to back them. Now what does Facebook cost to operate?
As far as I'm concerned, this pearl-clutching about "well websites have to make money" is grossly, grossly overstated. Websites don't cost that much to run. A ton of money is being siphoned off by the MBA parasites playing games in Excel all day. A ton more is being wasted developing features that advertisers want and users hate. A ton more is being funneled into making products artificially addictive to vulnerable people, to exploit them, so let's just not do that. And of course, leadership, rewarding themselves with generous compensation packages they aren't even remotely able to justify. Now what does your website cost to maintain? Surely not nothing, and for websites of substantial size, it will still be high, but I'm willing to bet it's a hell, hell, hell of a lot less than it was before.
The Silicon Valley "grow at all costs, always evolve and innovate forever" model is so detached from the reality of most businesses in my experience.
Popular websites that allow user content to be uploaded or linked do cost that much to run, due to content moderation.
There might be a small (relatively) forum here and there that a few good moderators are willing to slave away at keeping clean, but you will never see a website that allows user content with as many users as Reddit/Youtube/Instagram/etc be cheap.
Although, due to AI, the cost to spam the small forums might be so small that even they might come into the crosshairs.
Reddit outsourced most of its moderation to unpaid volunteers.
do you have any reference for your claim?
i use brave search and find it very useful. very rarely there is something i can't find, and when i run into that other search engines are not much better.
Instead the wording leaves wiggle room for the possibility of using multiple.
This was sarcasm. That system would never happen. If a birthday card requires no effort it is worse than nothing, literal noise.
> So by spending the same 20 minutes fiddling with some kind of graphics if the outcome becomes 10x better then I see that as an advantage. (That said I haven't tried this.)
Feels like you misunderstood. It is not "10x better" or an "advantage" because literally the easier it was for you, the less valuable it is.
... why? The closer I can approximate my imagined output (in a fixed time) the more value I see in it. The tool provides the added value.
We have better cameras than 30 years ago. Better pictures. It's easier to make a better photo in the same amount of time. I would be happier if my old blurry photos would be better.
Value is relative, of course, that's why exchange/trade works, right? And the same applies here too, with the added complication of an explicitly "unknowable price"; after all, I don't know a priori how much value my friend will assign to the card!
In general there are many situations that map closer to your understanding than mine, but I think you are dismissing also a lot of perfectly valid ones where your model simply doesn't apply.
For example if said friend values effort, but "apparent concept vs. actual results fidelity" is not part of effort for them.
Yet still we don't know if they value effort as in time spent (because if they value time spent then it's the same value in both of my scenarios) or if they value some "visible handcraftedness" (effort to produce something that I am known to be very amateur at). And even then the question arises, isn't picking the best tool for the job itself effort?
Of course if this friend simply abhors GenAI shit then they will think it's "low effort" even if I put hours and hours into it (that I wouldn't have put into a traditional image manipulation software, because there I realized soon that I would need months of learning, etc.)
I hadn't really thought about that topic in that way before. Really explains why some of those older MMOs have no desire to really make any improvements, the owners are happy to just keep them powered up and collect a check but have no incentive to invest in making them better.
I agree, but the flip side is that things switch from 'done and working' to 'dead' pretty quickly if no one is willing to do minor maintenance.