Training our own AI models

JimDabell 1 hour ago |

“Opt-in by default” is an oxymoron. If it’s default then I haven’t opted into anything. It’s been enabled by default.

xnorswap 1 hour ago | |

This frustrates me too. If something is "opt-in", that means by default you're not included and can choose to be included. If something is "opt-out", that means you're included and can choose not to be.

But then it gets used to describe the reverse, and we have to add words to clarify.

I once saw a post here with correctly described opt-in telemetry, and the top comment was attacking them for the reverse, thinking it was including them by default. So there's little winning; it's one of those words that has just come to mean its opposite.

irishcoffee 1 hour ago | |

You were given the option to option in, by default. Clearly it makes sense, optioned out by default only happens when someone loses money on the option in default instead.

The internet really stinks. The 1999 teenager in me somewhere is really bummed.

deflator 1 hour ago | |

Very true. I was considering PostHog, but this sours them in my eyes. Very deceptive wording.

skrebbel 1 hour ago | | |

They could’ve done a lot worse, and most companies would’ve.

deflator 1 hour ago | | |

There's a helpful response. It could always be worse!

mannanj 1 hour ago | |

Isn't it kind of like a mandatory tip? If you haven't given it voluntarily, i.e. it's automatically opted in and you maybe can't even decline it, it's the same.

abustamam 1 hour ago | | |

Many restaurants have the audacity to add a 20% (or whatever percentage) "service fee" that isn't considered tip. It even says something like "we use this to pay our staff competitive wages and health insurance." You can't opt out. It's just part of the bill. Then they have the gall to ask for a tip on top of that.

I've taken to a) leaving a negative Google or Yelp review for such establishments and b) never coming back. This is a practice that needs to die.

rectang 53 minutes ago | | |

Do you leave a negative review if they add the service charge but don't ask for a tip?

abustamam 51 minutes ago | | |

I've never had the pleasure of encountering that situation.

But at what point do we call a spade a spade and say it's just them secretly inflating their prices? "Everything is a penny, but we charge a 1000000% service charge."

rectang 49 minutes ago | | |

I used to wait tables once upon a time and it was standard practice to add a fixed service charge for any large party in lieu of a tip. Have you really never encountered that?

croes 1 hour ago | | |

Opt-in by default means it is either mandatory (if you can’t disable it) or it’s opt-out (if you can). "Opt-in by default" is BS to make it sound less invasive.

abustamam 1 hour ago | | |

Imagine if they said paying taxes was opt-in by default. No, it's mandatory! Sure, you can technically not pay taxes, but you won't have a good time.

_heimdall 43 minutes ago | | |

That is slightly different though. The government says you must pay taxes, mandatory as you said.

PostHog here is saying they will train on your data but opting out is allowed. For the taxes analogy to work, PostHog would not offer the opt-out option at all and you'd be doing something like hacking their system to filter your data out on their end.

Waterluvian 1 hour ago |

PostHog was a system we set up once, generally don't think about, and review from time to time, providing some occasional value. It was mostly harmless to leave around.

But it's apparently yet one more thing we have to be actively suspicious of as it defaults towards an intolerable state. So it's easier to just rip it out of the system and move on.

sixtyj 2 hours ago |

Most companies would bury this change in a deceptively boring T&Cs update, but we value transparency, so here's what you need to know in an internet-friendly numbered list:

1. Users on our EU cloud instance are opted out by default

2. So too users with agreements that prevent training (e.g. BAA, MSA, or similar)

3. All other users on our US cloud instance are opted in by default

4. We will anonymize all data before it's used for training

5. We will only use data that already exists in your PostHog instance

6. We will do all the model training ourselves, which means...

7. We won't sell or send your data to third-party model providers

8. You can opt out at any time via your org settings in PostHog (admin access required)

9. Training won't start until June 29, so there's plenty of time to decide

Dave_Rosenthal 38 minutes ago |

They say, "our goal here is to improve PostHog as a product for our customers, not to expose or sell models trained on your data" but then don't actually list that as a limitation in the bulleted points.

AFAICT this now gives them default permission to train an LLM on your code (as PostHog telemetry data is inextricably tied to your code), use it, and even sell it if they wanted to (as it's not your data anymore, it's their model). Yikes.

thecatapps 49 minutes ago |

It's probably very obvious by now, but there's something to be said about companies with the "SF Quirky" vibes:

- The OS Redesign

- "Sexy Legal Documents"

- Emails with "<relevant hedgehog meme goes here>" as the subject line

- Having a merch shop with action figures of your CEO

It works both ways. When you're looking for adoption and making very pro-user moves, I guess it can be a benefit. However, when you're now looking to grow revenue and making very anti-user moves, it adds insult to injury.

I'm the last person to say that tech "shouldn't be fun" or something overly-broad like that, but if your messaging doesn't match the decisions of leadership, you're gonna have a bad time.

frankest 1 hour ago |

What a great reminder to build my own analytics and self-host. PostHog just lost a customer. They could easily send an email to each customer asking if we want this. The assumption means they have no product intuition about their own customers, let alone the customers of their customers. Bye.

xrd 1 hour ago | |

Not trying to be snarky but why not just opt out instead of vibe coding your own analytics platform? I'm uncomfortable with people using my data to train AI, but those concerns revolve around where my data goes, and whether I'm notified/aware. Posthog is giving me good answers to those questions here.

frankest 1 hour ago | | |

It has to do with the priorities of the company and its leadership. Either they lack the basic awareness to know that training on your business customers' data will likely leak their sensitive information to their competitors, or they just intend to sell that data. We are not paying to have our data stolen.

xrd 1 hour ago | | |

Very fair point!

infecto 1 hour ago |

Thanks for posting. I had been on the fence for the past few months about switching. The new AI products combined with the weird UIs had been irking me for a while. This is the final nail in the coffin. Opt-in is a terrible business model imo.

thecatapps 1 hour ago | |

Agreed. While I don't entirely care enough to rip it out of any existing products, I certainly won't be adding it to any new ones.

I remember people cheering about their "OS" web redesign, which was the most confusing and unnecessary UX complication when I needed to go track down a session replay to debug something. (They've since added navigation to the top right.)

tines 1 hour ago |

“Opt-in by default” = opt-out?

Tsarp 1 hour ago | |

Guess it's "Opted-in" by default

natch 1 hour ago | | |

Then it’s not opted. It’s just in.

patates 1 hour ago | | |

"Possible to somewhat disable", I call it "PTSD".

mrits 1 hour ago | |

Opt means to make a choice or select an alternative. They are either incompetent or lying on purpose.

brauhaus 1 hour ago |

Every day I'm more glad about EU legislation, that's all I have to say for now

gobdovan 1 hour ago | |

Yeah, the legislation is morally defensible on its own terms. But when you look at the full system, something funny happens: EU legislation is blocking data extraction and platform lock-in tactics that Big Tech already used to become monopolies.

And since the big platforms don't have to unwind their advantages or pay back for the methods that are now restricted and considered illegal, they can peacefully extract rents from their entrenched positions for even longer, while everyone else is prevented from using the same ladder they climbed.

vovavili 1 hour ago | |

...until you learn the rates of economic development between Europe and the US since 2008.

Laurel1234 47 minutes ago | | |

Every last single cent of that "economic development" is in the hands of billionaires, at least people in Europe have rights and their government isn't a couple of monopolies in a trenchcoat.

abustamam 57 minutes ago |

> Why this is opt out, not opt in

> Put simply, because otherwise we will not have enough data to train a model that's actually useful.

AKA we won't be able to make as much money if we required you to give us permission to use your data.

rad_val 34 minutes ago |

All of them do if you don't do something about it (e.g. migrate to self-hosted solutions). Trusting a ToS in 2026 is as naive as it gets.

freshnode 1 hour ago |

Why won't companies explain what anonymisation means for them?

PostHog has unfettered logged-in access to some sensitive stuff. What steps are they actually taking to scrub sensitive data from my replay before it is used to train a model?

tartieret 54 minutes ago | |

This is what triggered my post. The announcement pretends that it's not a big deal because of "anonymization", but that's easier said than done. You can send custom events and logs that contain confidential information even if they don't contain personal identifiers.

the__alchemist 1 hour ago |

How much are they paying the users?

stevoski 28 minutes ago |

I’ve been evaluating PostHog for our company.

I’ve now made our decision. We won’t be using them.

If they are going to position themselves as the non-slimy no-BS guys, they can’t pull this nonsense.

ASinclair 1 hour ago |

Mostly unrelated but the name of this company makes me think it's a Dick-Pics-as-a-Service provider.

lljk_kennedy 1 hour ago | |

netdix.com

mrcwinn 1 hour ago |

Gross.

They’ll use your product and your data to later sell a product back to you.

xp84 36 minutes ago | |

Even if there were no AI, that's not any different than any SaaS where your data gets stored. Picking at random, Optimizely certainly has a ton of interaction data available and they build new features and products that leverage your data (without which the features would be impossible). Could be reporting tools, funnel analysis, etc.

gyoridavid 55 minutes ago |

I feel that the US should step up their legislation game and make sure these companies can't retroactively make rules to steal their users' data. I know it's trendy to hate the EU, but their legislation actually protects the users, and not the companies' interests.

jen20 1 hour ago |

Perhaps if they hopped on a quick call for five minutes with some customers, they'd realize quite how little appetite there is for putting up with being opted into things automatically in the US but not in the EU.

As an aside, this also means the EU rules are working.

freshnode 1 hour ago | |

+1 this made me glad we opted for the EU region

bigstrat2003 1 hour ago |

This is the fastest way possible to ensure I will never do business with you, or stop doing business with you if I already am.

tartieret 2 hours ago |

I initially used Posthog as an alternative to Google Analytics with more privacy. Now they want to use the data for a business purpose. Working hard towards enshittification?

rvz 2 hours ago | |

> I initially used Posthog as an alternative to Google Analytics with more privacy.

This does not make any sense.

> Now they want to use the data for a business purpose.

They raised VC money and they want a return so this was predictable.

mrits 1 hour ago | | |

It makes perfect sense actually

calmbonsai 1 hour ago |

LOL. You stay classy PostHog.

Henchman21 1 hour ago |

You can’t “opt-in” to something that is the default. The choice is made for you, and when the choice is made for you, you haven’t opted in or out.

scosman 1 hour ago | |

I would have guessed that was just a bad title here but no, article states it as "opted in by default".

tartieret 1 hour ago | | |

I fixed the title, sorry for the typo!

scosman 42 minutes ago | | |

not your fault, the article uses that language!

dzonga 57 minutes ago |

Another would-be excellent product company destroyed, or being slowly destroyed, by VCs and the endless chase for 'growth'.

mikkelam 57 minutes ago |

The enshittification has begun. Time to move on!

TZubiri 1 hour ago |

Today I was thinking, if I start a company in the LLM tooling space, I would put in the company mission in the incorporation documents that client data will not be used to train.

The temptation and the value is too great, and the opt-in opt-out consent thing ends up being a fuckery where the company tries to trick the user into allowing them to take a look into the data, presumably because they are selling the product at a loss and need an alternative revenue model.

Just make it impossible from the get-go. The fine print would be that the data can be shared out-of-band explicitly, in an email, or if explicitly copy-pasted into a support chatbox, but there would be no mechanism for us to read the data from the databases, much less from the client.

I don't mean it would be an air-tight mechanism like Signal or ProtonMail. If a court order required us to produce client info, we would still reserve the right to produce the data, but only exceptionally, and definitely not for training models.

OkayPhysicist 1 hour ago | |

More companies need to make, for lack of a better term, "oaths" of what they won't do as a company. My pitch on it is to tie it to financial penalties the company agrees to pay, somewhere in the "enough to incentivize a significant portion of our user base to sue us" territory, such that it would be financial suicide to violate them.

TZubiri 1 hour ago | | |

Contracts and incorporations are designed for this. The issue is that the incumbent legal strategy is to use template documents and to reduce potential disputes to $1 in private arbitration; essentially, legal's job is to make legal go away.

Another term I would incorporate is a Seppuku term: if we get hacked, I resign and the company goes bankrupt. Anything else is the wrong attitude to computer security for companies that want to scale to global reach.